LIDS Report 2779 






Constrained Consensus and Optimization in
Multi-Agent Networks

Angelia Nedić, Asuman Ozdaglar, and Pablo A. Parrilo
December 17, 2008

Abstract 

We present distributed algorithms that can be used by multiple agents to align
their estimates with a particular value over a network with time-varying connectivity.
Our framework is general in that this value can represent a consensus value
among multiple agents or an optimal solution of an optimization problem, where
the global objective function is a combination of local agent objective functions.
Our main focus is on constrained problems where the estimate of each agent is
restricted to lie in a different constraint set.
To highlight the effects of constraints, we first consider a constrained consensus
problem and present a distributed "projected consensus algorithm" in which
agents combine their local averaging operation with projection on their individual
constraint sets. This algorithm can be viewed as a version of an alternating
projection method with weights that are varying over time and across agents. We
establish convergence and convergence rate results for the projected consensus
algorithm. We next study a constrained optimization problem for optimizing the
sum of local objective functions of the agents subject to the intersection of their
local constraint sets. We present a distributed "projected subgradient algorithm"
which involves each agent performing a local averaging operation, taking a subgradient
step to minimize its own objective function, and projecting on its constraint
set. We show that, with an appropriately selected stepsize rule, the agent estimates
generated by this algorithm converge to the same optimal solution for the cases
when the weights are constant and equal, and when the weights are time-varying
but all agents have the same constraint set.

1 Introduction



There has been much interest in distributed cooperative control problems, in which 
several autonomous agents collectively try to achieve a global objective. Most focus 
has been on the canonical consensus problem, where the goal is to develop distributed 
algorithms that can be used by a group of agents to reach a common decision or agree- 
ment (on a scalar or vector value). Recent work also studied multi-agent optimization 
problems over networks with time-varying connectivity, where the objective function 
information is distributed across agents (e.g., the global objective function is the sum of 
local objective functions of agents). Despite much work in this area, the existing liter- 
ature does not consider problems where the agent values are constrained to given sets. 
Such constraints are significant in a number of applications including motion planning 
and alignment problems, where each agent's position is limited to a certain region or 
range, and distributed constrained multi-agent optimization problems. 

In this paper, we study cooperative control problems where the values of agents are 
constrained to lie in closed convex sets. Our main focus is on developing distributed 
algorithms for problems where the constraint information is distributed across agents, 
i.e., each agent only knows its own constraint set. To highlight the effects of different local 
constraints, we first consider a constrained consensus problem and propose a projected 
consensus algorithm that operates on the basis of local information. More specifically, 
each agent linearly combines its value with those values received from the time-varying 
neighboring agents and projects the combination on its own constraint set. We show 
that this update rule can be viewed as a version of the alternating projection method 
where, at each iteration, the values are combined using weights that are varying in time 
and across agents, and projected on the respective constraint sets. 

We provide convergence and convergence rate analysis for the projected consensus 
algorithm. Due to the projection operation, the resulting evolution of agent values has 
nonlinear dynamics, which poses challenges for the analysis of the algorithm's conver- 
gence properties. To deal with the nonlinear dynamics in the evolution of the agent 
estimates, we decompose the dynamics into two parts: a linear part involving a time- 
varying averaging operation and a nonlinear part involving the error due to the projection 
operation. This decomposition allows us to represent the evolution of the estimates using 
linear dynamics and decouples the analysis of the effects of constraints from the conver- 
gence analysis of the local agent averaging. The linear dynamics is analyzed similarly 
to that of the unconstrained consensus update, which relies on convergence of transi- 
tion matrices defined as the products of the time-varying weight matrices. Using the 
properties of projection and agent weights, we prove that the projection error diminishes 
to zero. This shows that the nonlinear parts in the dynamics are vanishing with time 
and, therefore, the evolution of agent estimates is "almost linear". We then show that 
the agents reach consensus on a "common estimate" in the limit and that the common 
estimate lies in the intersection of the agent individual constraint sets. 

We next consider a constrained optimization problem for optimizing a global objec- 
tive function which is the sum of local agent objective functions, subject to a constraint 
set given by the intersection of the local agent constraint sets. We focus on distributed 
algorithms in which agent values are updated based on local information given by the
agent's objective function and constraint set. In particular, we propose a distributed
projected subgradient algorithm, which for each agent involves a local averaging opera- 
tion, a step along the subgradient of the local objective function, and a projection on 
the local constraint set. 

We study the convergence behavior of this algorithm for two cases: when the con- 
straint sets are the same, but the agent connectivity is time-varying; and when the 
constraint sets $X_i$ are different, but the agents use uniform and constant weights in each
step, i.e., the communication graph is fully connected. We show that with an appro- 
priately selected stepsize rule, the agent estimates generated by this algorithm converge 
to the same optimal solution of the constrained optimization problem. Similar to the 
analysis of the projected consensus algorithm, our convergence analysis relies on showing 
that the projection errors converge to zero, thus effectively reducing the problem into an 
unconstrained one. However, in this case, establishing the convergence of the projection 
error to zero requires understanding the effects of the subgradient steps, which compli- 
cates the analysis. In particular, for the case with different constraint sets but uniform 
weights, the analysis uses an error bound which relates the distances of the iterates to 
individual constraint sets with the distances of the iterates to the intersection set. 

Related literature on parallel and distributed computation is vast. Most literature 
builds on the seminal work of Tsitsiklis [26] and Tsitsiklis et al. [27] (see also [3]), which 
focused on distributing the computations involved with optimizing a global objective 
function among different processors (assuming complete information about the global 
objective function at each processor). More recent literature focused on multi-agent 
environments and studied consensus algorithms for achieving cooperative behavior in 
a distributed manner (see [28], [12], [6], [21], [7], and [221 122]). These works assume 
that the agent values can be processed arbitrarily and are unconstrained. Another recent
approach for distributed cooperative control problems involves using game-theoretic
models. In this approach, the agents are endowed with local utility functions that lead 
to a game form with a Nash equilibrium which is the same as or close to a global opti- 
mum. Various learning algorithms can then be used as distributed control schemes that 
will reach the equilibrium. In a recent paper, Marden et al. [2] used this approach 
for the consensus problem where agents have constraints on their values. Our projected 
consensus algorithm provides an alternative approach for this problem. 

Most closely related to our work are the recent papers [18], [17], which proposed
distributed subgradient methods for solving unconstrained multi-agent optimization
problems. These methods use consensus algorithms as a mechanism for distributing
computations among the agents. The presence of different local constraints significantly
changes the operation and the analysis of the algorithms, which is our main focus in this 
paper. Our work is also related to incremental subgradient algorithms implemented over 
a network, where agents sequentially update an iterate sequence in a cyclic or a random 
order [4], [15], [24], [13]. In an incremental algorithm, there is a single iterate sequence and
only one agent updates the iterate at a given time. Thus, while operating on the basis of 
local information, incremental algorithms differ fundamentally from the algorithm stud- 
ied in this paper (where all agents update simultaneously). Furthermore, the work in 
[4], [15], [24], [13] assumes that the constraint set is known by all agents in the system, which
is in sharp contrast with the algorithms studied in this paper (our primary interest
is in the case where the information about the constraint set is distributed across the
agents). 

The paper is organized as follows. In Section 2, we introduce our notation and
terminology, and establish some basic results related to projection on closed convex sets
that will be used in the subsequent analysis. In Section 3, we present the constrained
consensus problem and the projected consensus algorithm. We describe our multi-agent
model and provide a basic result on the convergence behavior of the transition matrices
that govern the evolution of agent estimates generated by the algorithms. We study the
convergence of the agent estimates and establish convergence rate results for constant
uniform weights. Section 4 introduces the constrained multi-agent optimization problem
and presents the projected subgradient algorithm. We provide convergence analysis for
the estimates generated by this algorithm. Section 5 contains concluding remarks and
some future directions.

2 Notation, Terminology, and Basics 

A vector is viewed as a column, unless clearly stated otherwise. We denote by $x_i$ or $[x]_i$
the $i$-th component of a vector $x$. When $x_i \ge 0$ for all components $i$ of a vector $x$, we
write $x \ge 0$. We write $x'$ to denote the transpose of a vector $x$. The scalar product of
two vectors $x$ and $y$ is denoted by $x'y$. We use $\|x\|$ to denote the standard Euclidean
norm, $\|x\| = \sqrt{x'x}$.

A vector $a \in \mathbb{R}^m$ is said to be a stochastic vector when its components $a_j$ are
nonnegative and their sum is equal to 1, i.e., $\sum_{j=1}^m a_j = 1$. A set of $m$ vectors
$\{a^1, \ldots, a^m\}$, with $a^i \in \mathbb{R}^m$ for all $i$, is said to be doubly stochastic when each
$a^i$ is a stochastic vector and $\sum_{i=1}^m a^i_j = 1$ for all $j = 1, \ldots, m$. A square matrix $A$
is said to be doubly stochastic when its rows are stochastic vectors, and its columns are
also stochastic vectors.

We write $\mathrm{dist}(\bar x, X)$ to denote the standard Euclidean distance of a vector $\bar x$ from a
set $X$, i.e.,

    $$\mathrm{dist}(\bar x, X) = \inf_{x \in X} \|\bar x - x\|.$$

We use $P_X[\bar x]$ to denote the projection of a vector $\bar x$ on a closed convex set $X$, i.e.,

    $$P_X[\bar x] = \arg\min_{x \in X} \|\bar x - x\|.$$

In the subsequent development, the properties of the projection operation on a closed 
convex set play an important role. In particular, we use the projection inequality, i.e., 
for any vector x, 

    $$(P_X[x] - x)'(y - P_X[x]) \ge 0 \quad \text{for all } y \in X. \tag{1}$$

We also use the standard non-expansiveness property, i.e.,

    $$\|P_X[x] - P_X[y]\| \le \|x - y\| \quad \text{for any } x \text{ and } y. \tag{2}$$

In addition, we use the properties given in the following lemma.

Figure 1: Illustration of the relation between the projection error and feasible directions of a
convex set at the projection vector.

Lemma 1 Let $X$ be a nonempty closed convex set in $\mathbb{R}^n$. Then, we have for any $x \in \mathbb{R}^n$,

(a) $(P_X[x] - x)'(x - y) \le -\|P_X[x] - x\|^2$ for all $y \in X$.

(b) $\|P_X[x] - y\|^2 \le \|x - y\|^2 - \|P_X[x] - x\|^2$ for all $y \in X$.

Proof. (a) Let $x \in \mathbb{R}^n$ be arbitrary. Then, for any $y \in X$, we have

    $$(P_X[x] - x)'(x - y) = (P_X[x] - x)'(x - P_X[x]) + (P_X[x] - x)'(P_X[x] - y).$$

By the projection inequality [cf. Eq. (1)], it follows that $(P_X[x] - x)'(P_X[x] - y) \le 0$,
implying

    $$(P_X[x] - x)'(x - y) \le -\|P_X[x] - x\|^2 \quad \text{for all } y \in X.$$

(b) For an arbitrary $x \in \mathbb{R}^n$ and for all $y \in X$, we have

    $$\|P_X[x] - y\|^2 = \|P_X[x] - x + x - y\|^2
      = \|P_X[x] - x\|^2 + \|x - y\|^2 + 2 (P_X[x] - x)'(x - y).$$

By using the inequality of part (a), we obtain

    $$\|P_X[x] - y\|^2 \le \|x - y\|^2 - \|P_X[x] - x\|^2 \quad \text{for all } y \in X.$$
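
As a numerical sanity check of Eqs. (1)-(2) and Lemma 1, the following minimal sketch samples random points and tests the four inequalities for one particular set. The choice of set (the box $[-1,1]^n$), the helper names, the sample ranges, and the tolerance are all assumptions made for illustration; they are not part of the paper.

```python
import random

def project_box(x, lo, hi):
    """Euclidean projection of x onto the box {z : lo <= z_i <= hi for all i}."""
    return [min(max(xi, lo), hi) for xi in x]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def sub(u, v):
    return [ui - vi for ui, vi in zip(u, v)]

def check_projection_properties(trials=1000, n=3, seed=0, tol=1e-9):
    """Test Eq. (1), Eq. (2), and Lemma 1(a)-(b) on random samples (illustrative only)."""
    rng = random.Random(seed)
    lo, hi = -1.0, 1.0                                # assumed set X = [-1, 1]^n
    for _ in range(trials):
        x = [rng.uniform(-3, 3) for _ in range(n)]    # arbitrary point
        z = [rng.uniform(-3, 3) for _ in range(n)]    # second arbitrary point
        y = [rng.uniform(lo, hi) for _ in range(n)]   # point of X
        px, pz = project_box(x, lo, hi), project_box(z, lo, hi)
        err = sub(px, x)                              # projection error P_X[x] - x
        err_sq = dot(err, err)
        # Eq. (1): (P_X[x] - x)'(y - P_X[x]) >= 0
        if dot(err, sub(y, px)) < -tol:
            return False
        # Eq. (2): ||P_X[x] - P_X[z]|| <= ||x - z|| (compared in squared form)
        if dot(sub(px, pz), sub(px, pz)) > dot(sub(x, z), sub(x, z)) + tol:
            return False
        # Lemma 1(a): (P_X[x] - x)'(x - y) <= -||P_X[x] - x||^2
        if dot(err, sub(x, y)) > -err_sq + tol:
            return False
        # Lemma 1(b): ||P_X[x] - y||^2 <= ||x - y||^2 - ||P_X[x] - x||^2
        if dot(sub(px, y), sub(px, y)) > dot(sub(x, y), sub(x, y)) - err_sq + tol:
            return False
    return True
```

A passing check is only evidence for this one set and these samples; the lemma itself is what covers every nonempty closed convex set.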



Part (b) of the preceding Lemma establishes a relation between the projection error
vector and the feasible directions of the convex set $X$ at the projection vector, as
illustrated in Figure 1.

We next consider nonempty closed convex sets $X_i \subseteq \mathbb{R}^n$, for $i = 1, \ldots, m$, and an
averaged-vector $\hat x$ obtained by taking an average of vectors $x^i \in X_i$, i.e.,
$\hat x = \frac{1}{m} \sum_{i=1}^m x^i$ for some $x^i \in X_i$. We provide an "error bound" that relates the
distance of the averaged-vector $\hat x$ from the intersection set $X = \cap_{i=1}^m X_i$ to the distance
of $\hat x$ from the individual sets $X_i$. This relation, which is also of independent interest, will
play a key role in our analysis of the convergence of projection errors associated with various
distributed algorithms introduced in this paper. We establish the relation under an interior
point assumption on the intersection set $X = \cap_{i=1}^m X_i$ stated in the following:

Assumption 1 Given sets $X_i \subseteq \mathbb{R}^n$, $i = 1, \ldots, m$, let $X = \cap_{i=1}^m X_i$ denote their
intersection. There is a vector $\bar x \in \mathrm{int}(X)$, i.e., there exists a scalar $\delta > 0$ such that

    $$\{z \mid \|z - \bar x\| \le \delta\} \subset X.$$



We provide an error bound relation in the following lemma. 

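Lemma 2 Let $X_i \subseteq \mathbb{R}^n$, $i = 1, \ldots, m$, be nonempty closed convex sets that satisfy
Assumption 1. Let $x^i \in X_i$, $i = 1, \ldots, m$, be arbitrary vectors and define their average
as $\hat x = \frac{1}{m} \sum_{i=1}^m x^i$. Consider the vector $s \in \mathbb{R}^n$ defined by

    $$s = \frac{\epsilon}{\epsilon + \delta}\, \bar x + \frac{\delta}{\epsilon + \delta}\, \hat x,$$

where

    $$\epsilon = \sum_{j=1}^m \mathrm{dist}(\hat x, X_j),$$

and $\delta$ is the scalar given in Assumption 1.

(a) The vector $s$ belongs to the intersection set $X = \cap_{i=1}^m X_i$.

(b) We have the following relation

    $$\|\hat x - s\| \le \frac{1}{\delta m} \Big( \sum_{j=1}^m \|x^j - \bar x\| \Big) \Big( \sum_{i=1}^m \mathrm{dist}(\hat x, X_i) \Big).$$

As a particular consequence, we have

    $$\mathrm{dist}(\hat x, X) \le \frac{1}{\delta m} \Big( \sum_{j=1}^m \|x^j - \bar x\| \Big) \Big( \sum_{j=1}^m \mathrm{dist}(\hat x, X_j) \Big).$$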



Proof.

(a) We first show that the vector $s$ belongs to the intersection $X = \cap_{i=1}^m X_i$. To see this,
let $i \in \{1, \ldots, m\}$ be arbitrary and note that we can write $s$ as

    $$s = \frac{\epsilon}{\epsilon + \delta} \Big( \bar x + \frac{\delta}{\epsilon} \big( \hat x - P_{X_i}[\hat x] \big) \Big)
        + \frac{\delta}{\epsilon + \delta}\, P_{X_i}[\hat x].$$

By the definition of $\epsilon$, it follows that $\|\hat x - P_{X_i}[\hat x]\| \le \epsilon$, implying by the interior point
assumption (cf. Assumption 1) that the vector $\bar x + \frac{\delta}{\epsilon} (\hat x - P_{X_i}[\hat x])$ belongs to the set $X$,
and therefore to the set $X_i$. Since the vector $s$ is the convex combination of two vectors
in the set $X_i$, it follows by the convexity of $X_i$ that $s \in X_i$. The preceding argument is
valid for an arbitrary $i$, thus implying that $s \in X$.

(b) Using the definition of the vector $s$ and the vector $\hat x$, we have

    $$\|\hat x - s\| = \frac{\epsilon}{\epsilon + \delta} \Big\| \frac{1}{m} \sum_{j=1}^m (x^j - \bar x) \Big\|
      \le \frac{\epsilon}{\delta m} \sum_{j=1}^m \|x^j - \bar x\|.$$

Substituting the definition of $\epsilon$ yields the desired relation.
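
The error bound of Lemma 2 can be checked numerically on a small instance. The sketch below is illustrative only: the two boxes in $\mathbb{R}^2$, the interior point $\bar x = (0,0)$ with $\delta = 1$, and all function names are assumptions chosen so that distances and the intersection are easy to compute.

```python
import math
import random

def project_box(x, lo, hi):
    """Projection of x onto the box with lower corner lo and upper corner hi."""
    return [min(max(xi, l), h) for xi, l, h in zip(x, lo, hi)]

def norm(u):
    return math.sqrt(sum(ui * ui for ui in u))

def dist_box(x, lo, hi):
    """Euclidean distance from x to the box [lo, hi]."""
    p = project_box(x, lo, hi)
    return norm([xi - pi for xi, pi in zip(x, p)])

def check_error_bound(trials=500, seed=1, tol=1e-9):
    """Check Lemma 2(a)-(b) and its consequence on random points (assumed example sets)."""
    rng = random.Random(seed)
    boxes = [([-1.0, -1.0], [2.0, 2.0]),      # X_1 (assumed)
             ([-2.0, -2.0], [1.0, 1.0])]      # X_2 (assumed)
    inter = ([-1.0, -1.0], [1.0, 1.0])        # X = X_1 ∩ X_2
    x_bar, delta = [0.0, 0.0], 1.0            # ball of radius 1 around x_bar lies in X
    m = len(boxes)
    for _ in range(trials):
        xs = [[rng.uniform(l, h) for l, h in zip(lo, hi)] for lo, hi in boxes]
        x_hat = [sum(x[c] for x in xs) / m for c in range(2)]
        eps = sum(dist_box(x_hat, lo, hi) for lo, hi in boxes)
        s = [(eps * b + delta * h) / (eps + delta) for b, h in zip(x_bar, x_hat)]
        spread = sum(norm([a - b for a, b in zip(x, x_bar)]) for x in xs)
        bound = eps * spread / (delta * m)
        if dist_box(s, *inter) > tol:                                  # part (a): s in X
            return False
        if norm([a - b for a, b in zip(x_hat, s)]) > bound + tol:      # part (b)
            return False
        if dist_box(x_hat, *inter) > bound + tol:                      # consequence
            return False
    return True
```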



3 Constrained Consensus 

In this section, we describe the constrained consensus problem. In particular, we in- 
troduce our multi-agent model and the projected consensus algorithm that is locally 
executed by each agent. We provide some insights about the algorithm and we discuss 
its connection to the alternating projections method. We also introduce the assump- 
tions on the multi-agent model and present key elementary results that we use in our 
subsequent analysis of the projected consensus algorithm. In particular, we define the 
transition matrices governing the linear dynamics of the agent estimate evolution and 
give a basic convergence result for these matrices. The model assumptions and the 
transition matrix convergence properties will also be used for studying the constrained 
optimization problem and the projected subgradient algorithm that we introduce in
Section 4.



3.1 Multi-Agent Model and Algorithm

We consider a set of agents denoted by $V = \{1, \ldots, m\}$. We assume a slotted-time system,
and we denote by $x^i(k)$ the estimate generated and stored by agent $i$ at time slot $k$.
The agent estimate $x^i(k)$ is a vector in $\mathbb{R}^n$ that is constrained to lie in a nonempty closed
convex set $X_i \subseteq \mathbb{R}^n$ known only to agent $i$. The agents' objective is to cooperatively
reach a consensus on a common vector through a sequence of local estimate updates
(subject to the local constraint set) and local information exchanges (with neighboring
agents only).

We study a model where the agents exchange and update their estimates as follows:
To generate the estimate at time $k + 1$, agent $i$ forms a convex combination of its estimate
$x^i(k)$ with the estimates received from other agents at time $k$, and takes the projection
of this vector on its constraint set $X_i$. More specifically, agent $i$ at time $k + 1$ generates
its new estimate according to the following relation:

    $$x^i(k+1) = P_{X_i}\Big[ \sum_{j=1}^m a^i_j(k)\, x^j(k) \Big], \tag{3}$$

where $a^i = (a^i_1, \ldots, a^i_m)'$ is a vector of nonnegative weights.
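
The update in Eq. (3) can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: the constraint sets are boxes (so that projection is a componentwise clamp), the weights are constant and uniform, and the names `projected_consensus_step`, `run_projected_consensus`, and the three intervals are hypothetical choices rather than anything prescribed by the paper.

```python
def project_box(x, lo, hi):
    """Projection of x onto the box with lower corner lo and upper corner hi."""
    return [min(max(xi, l), h) for xi, l, h in zip(x, lo, hi)]

def projected_consensus_step(x, a, boxes):
    """One step of Eq. (3).

    x[i]     : current estimate x^i(k) of agent i (list of n floats)
    a[i][j]  : weight a^i_j(k) that agent i assigns to agent j's estimate
    boxes[i] : (lo, hi) corners of agent i's constraint set X_i (assumed to be a box)
    """
    m, n = len(x), len(x[0])
    new_x = []
    for i in range(m):
        # Local averaging: w^i(k) = sum_j a^i_j(k) x^j(k)
        w = [sum(a[i][j] * x[j][c] for j in range(m)) for c in range(n)]
        # Projection on agent i's own constraint set
        lo, hi = boxes[i]
        new_x.append(project_box(w, lo, hi))
    return new_x

def run_projected_consensus(x0, weights, boxes, steps):
    """Iterate Eq. (3); weights(k) returns the weight vectors used at time k."""
    x = [list(v) for v in x0]
    for k in range(steps):
        x = projected_consensus_step(x, weights(k), boxes)
    return x

# Assumed example: three agents on the real line with X_1 = [0, 2], X_2 = [1, 3],
# X_3 = [1.5, 4], whose intersection is [1.5, 2], and uniform weights 1/3.
boxes = [([0.0], [2.0]), ([1.0], [3.0]), ([1.5], [4.0])]
uniform = lambda k: [[1.0 / 3] * 3 for _ in range(3)]
final = run_projected_consensus([[0.0], [3.0], [4.0]], uniform, boxes, 200)
```

In this example the three estimates end up (numerically) at a common point of the intersection $[1.5, 2]$, even though no agent uses any constraint set other than its own.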

The relation in Eq. (3) defines the projected consensus algorithm. The method can
be interpreted as a multi-agent algorithm for finding a point in common to the given
closed convex sets $X_1, \ldots, X_m$. Note that the problem of finding a common point can
be formulated as an unconstrained convex optimization problem of the following form:

    $$\text{minimize } \ \frac{1}{2} \sum_{i=1}^m \|x - P_{X_i}[x]\|^2 \qquad \text{subject to } \ x \in \mathbb{R}^n. \tag{4}$$

In view of this optimization problem, the method can be interpreted as a distributed gradient
algorithm where each agent is assigned an objective function $f_i(x) = \frac{1}{2} \|x - P_{X_i}[x]\|^2$.
At each time $k + 1$, an agent incorporates new information $x^j(k)$ received from some of
the other agents and generates a weighted sum $\sum_{j=1}^m a^i_j(k) x^j(k)$. Then, the agent updates
its estimate by taking a step (with stepsize equal to 1) along the negative gradient
of its own objective function $f_i = \frac{1}{2} \|x - P_{X_i}[x]\|^2$ at $x = \sum_{j=1}^m a^i_j(k) x^j(k)$. In particular,
since the gradient of $f_i$ is $\nabla f_i(x) = x - P_{X_i}[x]$ (see Theorem 1.5.5 in Facchinei and Pang
[10]), the update rule in Eq. (3) is equivalent to the following gradient descent method
for minimizing $f_i$:

    $$x^i(k+1) = \sum_{j=1}^m a^i_j(k) x^j(k) - \Big( \sum_{j=1}^m a^i_j(k) x^j(k) - P_{X_i}\Big[ \sum_{j=1}^m a^i_j(k) x^j(k) \Big] \Big).$$

This view of the update rule motivates our line of analysis of the projected consensus
method. In particular, motivated by the objective function of problem (4), we use
$\sum_{i=1}^m \|x^i(k) - x\|^2$ with $x \in \cap_{i=1}^m X_i$ as a Lyapunov function measuring the progress of
the algorithm (see Section 3.6).¹



3.2 Relation to Alternating Projections Method 

The method of Eq. (3) is related to the classical alternating or cyclic projection method.
Given a finite collection of closed convex sets $\{X_i\}_{i \in I}$ with a nonempty intersection
(i.e., $\cap_{i \in I} X_i \ne \emptyset$), the alternating projection method finds a vector in the intersection
$\cap_{i \in I} X_i$. In other words, the algorithm solves the unconstrained problem (4). Alternating
projection methods generate a sequence of vectors by projecting iteratively on the sets
(either cyclically or with some given order), see Figure 2(a). The convergence behavior
of these methods has been established by Von Neumann [20] and Aronszajn [1] for the
case when the sets $X_i$ are affine; and by Gubin et al. [11] when the sets $X_i$ are closed and
convex. Gubin et al. [11] also have provided convergence rate results for a particular
form of alternating projection method. Similar rate results under different assumptions
have also been provided by Deutsch [8], and Deutsch and Hundal [9].
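
For comparison with the multi-agent update, the classical cyclic projection scheme on two sets can be sketched as follows. The two sets (the unit disk and the halfspace $\{z \mid z_1 \ge 0.5\}$ in $\mathbb{R}^2$), the starting point, and the function names are assumptions made purely for illustration.

```python
import math

def project_disk(x, radius=1.0):
    """Projection onto the disk {z : ||z|| <= radius} (assumed set X_1)."""
    r = math.hypot(x[0], x[1])
    if r <= radius:
        return list(x)
    return [radius * x[0] / r, radius * x[1] / r]

def project_halfspace(x, c=0.5):
    """Projection onto the halfspace {z : z_1 >= c} (assumed set X_2)."""
    return [max(x[0], c), x[1]]

def alternating_projections(x0, cycles):
    """Cyclic projection: a single iterate is projected on X_1, then on X_2, repeatedly."""
    x = list(x0)
    for _ in range(cycles):
        x = project_halfspace(project_disk(x))
    return x

point = alternating_projections([-2.0, 3.0], 50)
```

Here a single sequence is passed from set to set, whereas in the projected consensus algorithm every agent keeps its own sequence and mixes it with the others before projecting.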

The constrained consensus algorithm [cf. Eq. (3)] generates a sequence of iterates
for each agent as follows: at iteration $k$, each agent $i$ first forms a linear combination
of the other agent values $x^j(k)$ using its own weight vector $a^i(k)$ and then projects this
combination on its constraint set $X_i$. Therefore, the projected consensus algorithm can
be viewed as a version of the alternating projection algorithm, where the iterates are
combined with the weights varying over time and across agents, and then projected on
the individual constraint sets, see Figure 2(b).

Figure 2: Illustration of the connection between the alternating/cyclic projection method
and the constrained consensus algorithm for two closed convex sets $X_1$ and $X_2$. In plot (a),
the alternating projection algorithm generates a sequence $\{x(k)\}$ by iteratively projecting
onto sets $X_1$ and $X_2$, i.e., $x(k+1) = P_{X_1}[x(k)]$, $x(k+2) = P_{X_2}[x(k+1)]$. In plot (b),
the projected consensus algorithm generates sequences $\{x^i(k)\}$ for agents $i = 1, 2$ by first
combining the iterates with different weights and then projecting on respective sets $X_i$, i.e.,
$w^i(k) = \sum_{j=1}^2 a^i_j(k) x^j(k)$ and $x^i(k+1) = P_{X_i}[w^i(k)]$ for $i = 1, 2$.

¹ We focus throughout the paper on the case when the intersection set $\cap_{i=1}^m X_i$ is nonempty. If the
intersection set is empty, it follows from the definition of the algorithm that the agent estimates will
not reach a consensus. In this case, the estimate sequences $\{x^i(k)\}$ may exhibit oscillatory behavior or
may all be unbounded.

We conclude this section by noting that the alternating projection method has much
more structured weights than the weights we consider in this paper. As seen from the
assumptions on the agent weights in the following section, the analysis of our projected
consensus algorithm (and the projected subgradient algorithm introduced in Section 4)
is complicated by the general time variability of the weights $a^i_j(k)$.

3.3 Assumptions 

Following Tsitsiklis [26] (see also Blondel et al. [5]), we adopt the following assumptions
on the weight vectors $a^i(k)$, $i \in \{1, \ldots, m\}$, and on information exchange.

Assumption 2 (Weights Rule) There exists a scalar $\eta$ with $0 < \eta < 1$ such that for all
$i \in \{1, \ldots, m\}$,

(a) $a^i_i(k) \ge \eta$ for all $k \ge 0$.

(b) If $a^i_j(k) > 0$, then $a^i_j(k) \ge \eta$.

Assumption 3 (Doubly Stochasticity) The vectors $a^i(k) = (a^i_1(k), \ldots, a^i_m(k))'$ satisfy:

(a) $a^i(k) \ge 0$ and $\sum_{j=1}^m a^i_j(k) = 1$ for all $i$ and $k$, i.e., the vectors $a^i(k)$ are stochastic.

(b) $\sum_{i=1}^m a^i_j(k) = 1$ for all $j$ and $k$.
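
Assumptions 2 and 3 are simple to test for a given set of weight vectors. The following sketch is one possible check; the function name, the tolerance, and the two example weight matrices are hypothetical and serve only to illustrate the conditions.

```python
def satisfies_weight_assumptions(a, eta, tol=1e-12):
    """Check Assumptions 2 and 3 for one time slot.

    a[i][j] stands for the weight a^i_j(k); eta is the lower bound of Assumption 2.
    """
    m = len(a)
    for i in range(m):
        if a[i][i] < eta - tol:                        # Assumption 2(a): self-weight >= eta
            return False
        for j in range(m):
            if a[i][j] < 0:                            # Assumption 3(a): nonnegativity
                return False
            if 0 < a[i][j] < eta - tol:                # Assumption 2(b): positive weights >= eta
                return False
        if abs(sum(a[i]) - 1.0) > tol:                 # Assumption 3(a): each a^i sums to 1
            return False
    for j in range(m):
        if abs(sum(a[i][j] for i in range(m)) - 1.0) > tol:   # Assumption 3(b)
            return False
    return True

# Assumed examples: the first set of weights is doubly stochastic, the second is only stochastic.
good = [[0.5, 0.5, 0.0], [0.5, 0.25, 0.25], [0.0, 0.25, 0.75]]
bad = [[0.9, 0.1, 0.0], [0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]
```

With these examples, `satisfies_weight_assumptions(good, 0.25)` holds, while `bad` fails Assumption 3(b) because the weights placed on agent 1 sum to 1.4.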

Informally speaking, Assumption 2 says that every agent assigns a substantial weight
to the information received from its neighbors. This guarantees that the information
from each agent influences the information of every other agent persistently in time.
In other words, this assumption guarantees that the agent information is mixing at
a nondiminishing rate in time. Without this assumption, information from some of
the agents may become less influential in time and, in the limit, this may result in loss of
information from these agents.

Assumption 3(a) establishes that each agent takes a convex combination of its estimate
and the estimates of its neighbors. Assumption 3(b), together with Assumption 2,
ensures that the estimate of every agent is influenced by the estimates of every other
agent with the same frequency in the limit, i.e., all agents are equally influential in the
long run.

We now impose some rules on the agent information exchange. At each update time
$t_k$, the information exchange among the agents may be represented by a directed graph
$(V, E_k)$ with the set $E_k$ of directed edges given by

    $$E_k = \{(j, i) \mid a^i_j(k) > 0\}.$$

Note that, by Assumption 2(a), we have $(i, i) \in E_k$ for each agent $i$ and all $k$. Also, we
have $(j, i) \in E_k$ if and only if agent $i$ receives the information $x^j$ from agent $j$ in the
time interval $(t_k, t_{k+1})$.

We next formally state the connectivity assumption on the multi-agent system. This 
assumption ensures that the information of any agent i influences the information state 
of any other agent infinitely often in time. 

Assumption 4 (Connectivity) The graph $(V, E_\infty)$ is strongly connected, where $E_\infty$ is
the set of edges $(j, i)$ representing agent pairs communicating directly infinitely many
times, i.e.,

    $$E_\infty = \{(j, i) \mid (j, i) \in E_k \text{ for infinitely many indices } k\}.$$
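
To make the edge sets concrete, the sketch below builds $E_k$ from the weights and tests strong connectivity of $E_\infty$. It assumes a periodic weight schedule, for which $E_\infty$ is simply the union of the edge sets over one period; the schedule, the agent indexing from 0, and the function names are illustrative assumptions.

```python
def edge_set(a):
    """E_k = {(j, i) : a^i_j(k) > 0}, where a[i][j] stands for a^i_j(k)."""
    m = len(a)
    return {(j, i) for i in range(m) for j in range(m) if a[i][j] > 0}

def is_strongly_connected(m, edges):
    """True if every node reaches every other node along directed edges."""
    def reaches_all(adj):
        seen, stack = {0}, [0]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        return len(seen) == m
    forward = [[] for _ in range(m)]
    backward = [[] for _ in range(m)]
    for (j, i) in edges:
        forward[j].append(i)
        backward[i].append(j)
    # Node 0 must reach all nodes in the graph and in the reversed graph.
    return reaches_all(forward) and reaches_all(backward)

# Assumed periodic schedule with period 2: agents 0 and 1 average in even slots,
# agents 1 and 2 average in odd slots (agents are indexed from 0 here).
A_even = [[0.5, 0.5, 0.0], [0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]
A_odd = [[1.0, 0.0, 0.0], [0.0, 0.5, 0.5], [0.0, 0.5, 0.5]]
E_inf = edge_set(A_even) | edge_set(A_odd)   # union over one period
```

No single slot of this schedule is connected, yet $(V, E_\infty)$ is strongly connected, which is the situation Assumption 4 is designed to allow.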

We also adopt an additional assumption that the intercommunication intervals are 
bounded for those agents that communicate directly. In particular, this is stated in the 
following. 

Assumption 5 (Bounded Intercommunication Interval) There exists an integer $B \ge 1$
such that for every $(j, i) \in E_\infty$, agent $j$ sends its information to a neighboring agent $i$
at least once every $B$ consecutive time slots, i.e., at time $t_k$ or at time $t_{k+1}$ or ... or (at
latest) at time $t_{k+B-1}$ for any $k \ge 0$.

In other words, the preceding assumption guarantees that every pair of agents that
communicate directly infinitely many times exchange information at least once every $B$
time slots.²

² It is possible to adopt weaker connectivity assumptions for the multi-agent model, such as those used in
the recent work [18].



3.4 Transition Matrices 



We introduce matrices $A(s)$, whose $i$-th column is the weight vector $a^i(s)$, and the
matrices

    $$\Phi(k, s) = A(s) A(s+1) \cdots A(k-1) A(k) \quad \text{for all } s \text{ and } k \text{ with } k \ge s,$$

where

    $$\Phi(k, k) = A(k) \quad \text{for all } k.$$

We use these matrices to describe the evolution of the agent estimates associated with the
algorithms introduced in Sections 3 and 4. The convergence properties of these matrices
as $k \to \infty$ have been extensively studied and well-established (see [26], [12], [29]). Under
the assumptions of Section 3.3, the matrices $\Phi(k, s)$ converge as $k \to \infty$ to a uniform
steady state distribution for each $s$ at a geometric rate, i.e., $\lim_{k \to \infty} \Phi(k, s) = \frac{1}{m} e e'$ for
all $s$, where $e$ denotes the vector with all entries equal to 1. The fact that transition matrices
converge at a geometric rate plays a crucial role in our analysis of the algorithms. Recent work
has established explicit convergence rate results for the transition matrices [18], [17]. These
results are given in the following proposition without a proof.

Proposition 1 Let Assumptions 2, 3, 4, and 5 hold. Then, we have the following:

(a) The entries $[\Phi(k, s)]^i_j$ of the transition matrices converge to $\frac{1}{m}$ as $k \to \infty$ with a
geometric rate uniformly with respect to $i$ and $j$, i.e., for all $i, j \in \{1, \ldots, m\}$,

    $$\Big| [\Phi(k, s)]^i_j - \frac{1}{m} \Big| \le 2\, \frac{1 + \eta^{-B_0}}{1 - \eta^{B_0}} \big( 1 - \eta^{B_0} \big)^{\frac{k-s}{B_0}}
      \quad \text{for all } s \text{ and } k \text{ with } k \ge s.$$

(b) In the absence of Assumption 3(b) [i.e., the weights $a^i(k)$ are stochastic but not
doubly stochastic], the columns $[\Phi(k, s)]^i$ of the transition matrices converge to a
stochastic vector $\phi(s)$ as $k \to \infty$ with a geometric rate uniformly with respect to
$i$ and $j$, i.e., for all $i, j \in \{1, \ldots, m\}$,

    $$\big| [\Phi(k, s)]^i_j - \phi_j(s) \big| \le 2\, \frac{1 + \eta^{-B_0}}{1 - \eta^{B_0}} \big( 1 - \eta^{B_0} \big)^{\frac{k-s}{B_0}}
      \quad \text{for all } s \text{ and } k \text{ with } k \ge s.$$

Here, $\eta$ is the lower bound of Assumption 2, $B_0 = (m - 1)B$, $m$ is the number of agents,
and $B$ is the intercommunication interval bound of Assumption 5.
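
The convergence of the transition matrices to $\frac{1}{m} e e'$ can be observed numerically on a small example. The sketch below uses an assumed periodic schedule of two symmetric doubly stochastic matrices on three agents (so the row/column convention for $A(s)$ does not matter here); the schedule and helper names are illustrative and not taken from the paper.

```python
def matmul(p, q):
    """Product of two square matrices given as lists of rows."""
    m = len(p)
    return [[sum(p[i][l] * q[l][j] for l in range(m)) for j in range(m)]
            for i in range(m)]

def transition_matrix(weights, s, k):
    """Phi(k, s) = A(s) A(s+1) ... A(k) for k >= s, with weights(t) returning A(t)."""
    phi = weights(s)
    for t in range(s + 1, k + 1):
        phi = matmul(phi, weights(t))
    return phi

# Assumed schedule: agents 1 and 2 average in even slots, agents 2 and 3 in odd slots.
A_even = [[0.5, 0.5, 0.0], [0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]
A_odd = [[1.0, 0.0, 0.0], [0.0, 0.5, 0.5], [0.0, 0.5, 0.5]]
schedule = lambda t: A_even if t % 2 == 0 else A_odd

phi = transition_matrix(schedule, 0, 59)
# Largest deviation of any entry of Phi(59, 0) from the uniform value 1/m
max_dev = max(abs(phi[i][j] - 1.0 / 3) for i in range(3) for j in range(3))
```

For this schedule the deviation from $\frac{1}{3} e e'$ shrinks geometrically with the number of factors, consistent with the geometric rate stated in the proposition (the proposition's explicit constant is a worst-case bound and is not tight for this example).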



3.5 Convergence 

In this section, we study the convergence behavior of the agent estimates $\{x^i(k)\}$ generated
by the projected consensus algorithm (3) under Assumptions 2-5. We write the
update rule in Eq. (3) as

    $$x^i(k+1) = \sum_{j=1}^m a^i_j(k)\, x^j(k) + e^i(k), \tag{5}$$

where $e^i(k)$ represents the error due to projection given by

    $$e^i(k) = P_{X_i}\Big[ \sum_{j=1}^m a^i_j(k)\, x^j(k) \Big] - \sum_{j=1}^m a^i_j(k)\, x^j(k). \tag{6}$$



As indicated by the preceding two relations, the evolution dynamics of the estimates
$x^i(k)$ for each agent is decomposed into a sum of a linear (time-varying) term
$\sum_{j=1}^m a^i_j(k) x^j(k)$ and a nonlinear term $e^i(k)$. The linear term captures the effects of
mixing the agent estimates, while the nonlinear term captures the nonlinear effects of
the projection operation. This decomposition plays a crucial role in our analysis. As
we will shortly see [cf. Lemma 3(d)], under the doubly stochasticity assumption on the
weights, the nonlinear terms $e^i(k)$ are diminishing in time for each $i$, and therefore, the
evolution of agent estimates is "almost linear". Thus, the nonlinear term can be viewed
as a non-persistent disturbance in the linear evolution of the estimates.

For notational convenience, let $w^i(k)$ denote

    $$w^i(k) = \sum_{j=1}^m a^i_j(k)\, x^j(k). \tag{7}$$

In this notation, the iterate $x^i(k+1)$ and the projection error $e^i(k)$ are given by

    $$x^i(k+1) = P_{X_i}[w^i(k)], \tag{8}$$

    $$e^i(k) = x^i(k+1) - w^i(k). \tag{9}$$

In the following lemma, we show some relations for the sums $\sum_{i=1}^m \|x^i(k) - x\|^2$ and
$\sum_{i=1}^m \|w^i(k) - x\|^2$, and $\sum_{i=1}^m \|x^i(k) - x\|$ and $\sum_{i=1}^m \|w^i(k) - x\|$ for an arbitrary vector
$x$ in the intersection of the agent constraint sets. Also, we prove that the errors $e^i(k)$
converge to zero as $k \to \infty$ for all $i$. The projection properties given in Lemma 1 and
the doubly stochasticity of the weights play crucial roles in establishing these relations.

Lemma 3 Let the intersection set $X = \cap_{i=1}^m X_i$ be nonempty, and let the Doubly Stochasticity
assumption hold (cf. Assumption 3). Let $x^i(k)$, $w^i(k)$, and $e^i(k)$ be defined by
Eqs. (7)-(9). Then, we have the following.

(a) For all $x \in X$ and all $k$, we have

    (i) $\|x^i(k+1) - x\|^2 \le \|w^i(k) - x\|^2 - \|e^i(k)\|^2$ for all $i$,

    (ii) $\sum_{i=1}^m \|w^i(k) - x\|^2 \le \sum_{i=1}^m \|x^i(k) - x\|^2$,

    (iii) $\sum_{i=1}^m \|w^i(k) - x\| \le \sum_{i=1}^m \|x^i(k) - x\|$.

(b) For all $x \in X$, the sequences $\{\sum_{i=1}^m \|w^i(k) - x\|^2\}$ and $\{\sum_{i=1}^m \|x^i(k) - x\|^2\}$ are
monotonically nonincreasing with $k$.

(c) For all $x \in X$, the sequences $\{\sum_{i=1}^m \|w^i(k) - x\|\}$ and $\{\sum_{i=1}^m \|x^i(k) - x\|\}$ are
monotonically nonincreasing with $k$.

(d) The errors $e^i(k)$ converge to zero as $k \to \infty$, i.e.,

    $$\lim_{k \to \infty} e^i(k) = 0 \quad \text{for all } i.$$



Proof. (a) For any $x \in X$ and $i$, we consider the term $\|x^i(k+1) - x\|^2$. Since $X \subseteq X_i$
for all $i$, it follows that $x \in X_i$ for all $i$. Since we also have $x^i(k+1) = P_{X_i}[w^i(k)]$, we
have from Lemma 1(b) that

    $$\|x^i(k+1) - x\|^2 \le \|w^i(k) - x\|^2 - \|x^i(k+1) - w^i(k)\|^2 \quad \text{for all } x \in X \text{ and } k \ge 0,$$

which yields the relation in part (a)(i) in view of relation (9).

By the definition of $w^i(k)$ in Eq. (7) and the stochasticity of the weight vector $a^i(k)$
[cf. Assumption 3(a)], we have for every agent $i$ and any $x \in X$,

    $$w^i(k) - x = \sum_{j=1}^m a^i_j(k) \big( x^j(k) - x \big) \quad \text{for all } k \ge 0. \tag{10}$$

Thus, for any $x \in X$, and all $i$ and $k$,

    $$\|w^i(k) - x\|^2 = \Big\| \sum_{j=1}^m a^i_j(k) \big( x^j(k) - x \big) \Big\|^2 \le \sum_{j=1}^m a^i_j(k)\, \|x^j(k) - x\|^2,$$

where the inequality holds since the vector $\sum_{j=1}^m a^i_j(k) (x^j(k) - x)$ is a convex combination
of the vectors $x^j(k) - x$ and the squared norm $\|\cdot\|^2$ is a convex function. By summing
the preceding relations over $i = 1, \ldots, m$, we obtain

    $$\sum_{i=1}^m \|w^i(k) - x\|^2 \le \sum_{i=1}^m \sum_{j=1}^m a^i_j(k)\, \|x^j(k) - x\|^2
      = \sum_{j=1}^m \Big( \sum_{i=1}^m a^i_j(k) \Big) \|x^j(k) - x\|^2.$$

Using the doubly stochasticity of the weight vectors $a^i(k)$, i.e., $\sum_{i=1}^m a^i_j(k) = 1$ for all $j$
and $k$ [cf. Assumption 3(b)], we obtain the relation in part (a)(ii),

    $$\sum_{i=1}^m \|w^i(k) - x\|^2 \le \sum_{i=1}^m \|x^i(k) - x\|^2 \quad \text{for all } x \in X \text{ and } k \ge 0.$$

Similarly, from relation (10) and the doubly stochasticity of the weights, we obtain
for all $x \in X$ and all $k$,

    $$\sum_{i=1}^m \|w^i(k) - x\| \le \sum_{i=1}^m \sum_{j=1}^m a^i_j(k)\, \|x^j(k) - x\| = \sum_{j=1}^m \|x^j(k) - x\|,$$

thus showing the relation in part (a)(iii).

(b) For any $x \in X$, the nonincreasing properties of the sequences $\{\sum_{i=1}^m \|w^i(k) - x\|^2\}$
and $\{\sum_{i=1}^m \|x^i(k) - x\|^2\}$ follow by combining the relations in parts (a)(i)-(ii).

(c) Since $x^i(k+1) = P_{X_i}[w^i(k)]$ for all $i$ and $k \ge 0$, using the nonexpansiveness property
of the projection operation [cf. Eq. (2)], we have

    $$\|x^i(k+1) - x\| \le \|w^i(k) - x\| \quad \text{for all } x \in X_i, \text{ all } i \text{ and } k.$$

Summing the preceding relations over all $i \in \{1, \ldots, m\}$ yields for all $k$,

    $$\sum_{i=1}^m \|x^i(k+1) - x\| \le \sum_{i=1}^m \|w^i(k) - x\| \quad \text{for all } x \in X. \tag{11}$$

The nonincreasing property of the sequences $\{\sum_{i=1}^m \|w^i(k) - x\|\}$ and $\{\sum_{i=1}^m \|x^i(k) - x\|\}$
follows from the preceding relation and the relation in part (a)(iii).

(d) By summing the relations in part (a)(i) over $i = 1, \ldots, m$, we obtain for any $x \in X$,

    $$\sum_{i=1}^m \|x^i(k+1) - x\|^2 \le \sum_{i=1}^m \|w^i(k) - x\|^2 - \sum_{i=1}^m \|e^i(k)\|^2 \quad \text{for all } k \ge 0.$$

Combined with the inequality $\sum_{i=1}^m \|w^i(k) - x\|^2 \le \sum_{i=1}^m \|x^i(k) - x\|^2$ of part (a)(ii),
we further obtain

    $$\sum_{i=1}^m \|e^i(k)\|^2 \le \sum_{i=1}^m \|x^i(k) - x\|^2 - \sum_{i=1}^m \|x^i(k+1) - x\|^2 \quad \text{for all } k \ge 0.$$

Summing these relations over $k = 0, \ldots, s$ for any $s > 0$ yields

    $$\sum_{k=0}^s \sum_{i=1}^m \|e^i(k)\|^2 \le \sum_{i=1}^m \|x^i(0) - x\|^2 - \sum_{i=1}^m \|x^i(s+1) - x\|^2
      \le \sum_{i=1}^m \|x^i(0) - x\|^2.$$

By letting $s \to \infty$, we obtain

    $$\sum_{k=0}^\infty \sum_{i=1}^m \|e^i(k)\|^2 \le \sum_{i=1}^m \|x^i(0) - x\|^2,$$

implying $\lim_{k \to \infty} \|e^i(k)\| = 0$ for all $i$. ■

We next consider the evolution of the estimates $x^i(k+1)$ generated by method (3) over a period of time. In particular, we relate the estimates $x^i(k+1)$ to the estimates $x^i(s)$ generated earlier in time $s$ with $s < k+1$ by exploiting the decomposition of the estimate evolution in Eqs. (7)-(9). In this, we use the transition matrices $\Phi(k,s)$ from time $s$ to time $k$ (see Section 3.4). As we will shortly see, the linear part of the dynamics is given in terms of the transition matrices, while the nonlinear part involves combinations of the transition matrices and the error terms from time $s$ to time $k$.

Recall that the transition matrices are defined as follows:

$$\Phi(k,s) = A(s)A(s+1) \cdots A(k-1)A(k) \quad \text{for all } s \text{ and } k \text{ with } k \ge s,$$

where

$$\Phi(k,k) = A(k) \quad \text{for all } k,$$

and each $A(s)$ is a matrix whose $i$-th column is the vector $a^i(s)$. Using these transition matrices and the decomposition of the estimate evolution of Eqs. (7)-(9), the relation between $x^i(k+1)$ and the estimates $x^1(s), \ldots, x^m(s)$ at time $s \le k$ is given by

$$x^i(k+1) = \sum_{j=1}^m [\Phi(k,s)]^i_j\, x^j(s) + \sum_{r=s+1}^{k} \Bigl( \sum_{j=1}^m [\Phi(k,r)]^i_j\, e^j(r-1) \Bigr) + e^i(k). \tag{12}$$

Here we can view $e^j(k)$ as an external perturbation input to the system.
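As a hedged numerical illustration of the transition matrices, the following Python sketch forms the product $\Phi(k,s) = A(s) \cdots A(k)$ for a hypothetical schedule that alternates between two symmetric doubly stochastic matrices. The matrices `A_EVEN` and `A_ODD`, the schedule `weight_matrix`, and all function names are assumptions made for this example only; they are not taken from the paper. Because the matrices are symmetric, the row versus column convention for $a^i(s)$ does not matter here.

```python
# Two symmetric doubly stochastic matrices (illustrative choices only).
A_EVEN = [[0.5, 0.5, 0.0],
          [0.5, 0.5, 0.0],
          [0.0, 0.0, 1.0]]
A_ODD = [[1.0, 0.0, 0.0],
         [0.0, 0.5, 0.5],
         [0.0, 0.5, 0.5]]


def weight_matrix(k):
    """Hypothetical schedule: A(k) alternates between the two matrices."""
    return A_EVEN if k % 2 == 0 else A_ODD


def matmul(P, Q):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(P)
    return [[sum(P[i][t] * Q[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def transition_matrix(k, s):
    """Phi(k, s) = A(s) A(s+1) ... A(k) for k >= s, with Phi(k, k) = A(k)."""
    phi = weight_matrix(s)
    for r in range(s + 1, k + 1):
        phi = matmul(phi, weight_matrix(r))
    return phi


def max_deviation_from_uniform(phi):
    """Largest entrywise distance between Phi and the matrix with entries 1/m."""
    m = len(phi)
    return max(abs(phi[i][j] - 1.0 / m) for i in range(m) for j in range(m))


if __name__ == "__main__":
    for k in (0, 1, 9, 39):
        print(k, max_deviation_from_uniform(transition_matrix(k, 0)))
```

In this example the union of the two communication patterns is connected, so the entries of $\Phi(k,0)$ approach $1/m$ as $k$ grows, which is the behavior the subsequent analysis relies on.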

We use this relation to study the "steady-state" behavior of a related process. In particular, we define an auxiliary sequence $\{y(k)\}$, where $y(k)$ is given by

$$y(k) = \frac{1}{m} \sum_{i=1}^m w^i(k) \quad \text{for all } k. \tag{13}$$

Since $w^i(k) = \sum_{j=1}^m a^i_j(k)\, x^j(k)$, under the doubly stochasticity of the weights, it follows that

$$y(k) = \frac{1}{m} \sum_{j=1}^m x^j(k) \quad \text{for all } k. \tag{14}$$

Furthermore, from the relations in (12), using the doubly stochasticity of the weights, we have

$$y(k) = \frac{1}{m} \sum_{j=1}^m x^j(s) + \frac{1}{m} \sum_{r=s+1}^{k} \Bigl( \sum_{j=1}^m e^j(r-1) \Bigr). \tag{15}$$

We now show that the limiting behavior of the agent estimates $x^i(k)$ is the same as the limiting behavior of $y(k)$ as $k \to \infty$. We establish this result using the assumptions on the multi-agent model of Section 3.3.

Lemma 4 Let the intersection set $X = \cap_{i=1}^m X_i$ be nonempty. Also, let Assumptions 2, 3, 4, and 5 hold. We then have

$$\lim_{k \to \infty} \|x^i(k) - y(k)\| = 0, \qquad \lim_{k \to \infty} \|w^i(k) - y(k)\| = 0 \quad \text{for all } i.$$



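Proof. By Lemma 3(d), we have $e^i(k) \to 0$ as $k \to \infty$ for all $i$. Therefore, for any $\epsilon > 0$, we can choose some integer $s$ such that $\|e^i(k)\| \le \epsilon$ for all $k \ge s$ and for all $i$. Using the relations in Eqs. (12) and (15), we obtain for all $i$ and $k \ge s+1$,

$$\|x^i(k) - y(k)\| = \Bigl\| \sum_{j=1}^m \Bigl( [\Phi(k-1,s)]^i_j - \frac{1}{m} \Bigr) x^j(s) + \sum_{r=s+1}^{k-1} \sum_{j=1}^m \Bigl( [\Phi(k-1,r)]^i_j - \frac{1}{m} \Bigr) e^j(r-1) + e^i(k-1) - \frac{1}{m} \sum_{j=1}^m e^j(k-1) \Bigr\|$$

$$\le \sum_{j=1}^m \Bigl| [\Phi(k-1,s)]^i_j - \frac{1}{m} \Bigr| \, \|x^j(s)\| + \sum_{r=s+1}^{k-1} \sum_{j=1}^m \Bigl| [\Phi(k-1,r)]^i_j - \frac{1}{m} \Bigr| \, \|e^j(r-1)\| + \|e^i(k-1)\| + \frac{1}{m} \sum_{j=1}^m \|e^j(k-1)\|.$$

Using the estimates for $\bigl| [\Phi(k-1,s)]^i_j - \frac{1}{m} \bigr|$ of Proposition 1(a), we have

$$\|x^i(k) - y(k)\| \le 2\, \frac{1 + \eta^{-B_0}}{1 - \eta^{B_0}} \bigl(1 - \eta^{B_0}\bigr)^{\frac{k-1-s}{B_0}} \sum_{j=1}^m \|x^j(s)\| + \sum_{r=s+1}^{k-1} 2\, \frac{1 + \eta^{-B_0}}{1 - \eta^{B_0}} \bigl(1 - \eta^{B_0}\bigr)^{\frac{k-1-r}{B_0}} \sum_{j=1}^m \|e^j(r-1)\| + \|e^i(k-1)\| + \frac{1}{m} \sum_{j=1}^m \|e^j(k-1)\|. \tag{16}$$

Since $\|e^i(k)\| \le \epsilon$ for all $k \ge s$ and for all $i$, from the preceding inequality we obtain

$$\|x^i(k) - y(k)\| \le 2\, \frac{1 + \eta^{-B_0}}{1 - \eta^{B_0}} \bigl(1 - \eta^{B_0}\bigr)^{\frac{k-1-s}{B_0}} \sum_{j=1}^m \|x^j(s)\| + 2 m \epsilon\, \frac{1 + \eta^{-B_0}}{1 - \eta^{B_0}} \cdot \frac{1}{1 - \bigl(1 - \eta^{B_0}\bigr)^{\frac{1}{B_0}}} + 2\epsilon.$$

Thus, by taking the limit superior as $k \to \infty$, we see that

$$\limsup_{k \to \infty} \|x^i(k) - y(k)\| \le 2 m \epsilon\, \frac{1 + \eta^{-B_0}}{1 - \eta^{B_0}} \cdot \frac{1}{1 - \bigl(1 - \eta^{B_0}\bigr)^{\frac{1}{B_0}}} + 2\epsilon,$$

which, by the arbitrary choice of $\epsilon$, implies $\lim_{k \to \infty} \|x^i(k) - y(k)\| = 0$ for all $i$.

Consider now $\sum_{i=1}^m \|w^i(k) - y(k)\|$. By using $w^i(k) = \sum_{j=1}^m a^i_j(k)\, x^j(k)$ [cf. Eq. (7)] and the stochasticity of the vector $a^i(k)$, we have

$$\sum_{i=1}^m \|w^i(k) - y(k)\| \le \sum_{i=1}^m \sum_{j=1}^m a^i_j(k)\, \|x^j(k) - y(k)\|.$$

By exchanging the order of the summations over $i$ and $j$, and using the doubly stochasticity of $a^i(k)$, we further obtain

$$\sum_{i=1}^m \|w^i(k) - y(k)\| \le \sum_{j=1}^m \Bigl( \sum_{i=1}^m a^i_j(k) \Bigr) \|x^j(k) - y(k)\| = \sum_{j=1}^m \|x^j(k) - y(k)\|. \tag{17}$$

Since $\lim_{k \to \infty} \|x^j(k) - y(k)\| = 0$ for all $j$, we have

$$\lim_{k \to \infty} \sum_{i=1}^m \|w^i(k) - y(k)\| = 0,$$

implying $\lim_{k \to \infty} \|w^i(k) - y(k)\| = 0$ for all $i$. ■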

We next show that the agents reach a consensus asymptotically, i.e., the agent estimates $x^i(k)$ converge to the same point as $k$ goes to infinity.

Proposition 2 (Consensus) Let the set $X = \cap_{i=1}^m X_i$ be nonempty. Also, let Assumptions 2, 3, 4, and 5 hold. For all $i$, let the sequence $\{x^i(k)\}$ be generated by the projected consensus algorithm (3). We then have for some $\tilde{x} \in X$,

$$\lim_{k \to \infty} \|x^i(k) - \tilde{x}\| = \lim_{k \to \infty} \|w^i(k) - \tilde{x}\| = 0 \quad \text{for all } i.$$



Proof. The proof idea is to consider the sequence $\{y(k)\}$, defined in Eq. (15), and show that it has a limit point in the set $X$. By using this and Lemma 4, we establish the convergence of each $w^i(k)$ and $x^i(k)$ to $\tilde{x}$.

To show that $\{y(k)\}$ has a limit point in the set $X$, we first consider the sequence

$$\sum_{j=1}^m \mathrm{dist}(y(k), X_j).$$

Since $x^j(k) \in X_j$ for all $j$ and $k \ge 0$, we have

$$\sum_{j=1}^m \mathrm{dist}(y(k), X_j) \le \sum_{j=1}^m \|y(k) - x^j(k)\|.$$

Taking the limit as $k \to \infty$ in the preceding relation and using Lemma 4, we conclude

$$\lim_{k \to \infty} \sum_{j=1}^m \mathrm{dist}(y(k), X_j) = 0. \tag{18}$$

For a given $x \in X$, using Lemma 3(c), we have

$$\sum_{i=1}^m \|x^i(k) - x\| \le \sum_{i=1}^m \|x^i(0) - x\| \quad \text{for all } k \ge 0.$$

This implies that the sequence $\{\sum_{i=1}^m \|x^i(k) - x\|\}$, and therefore each of the sequences $\{x^i(k)\}$, is bounded. Since for all $i$

$$\|y(k)\| \le \|x^i(k) - y(k)\| + \|x^i(k)\| \quad \text{for all } k \ge 0,$$

using Lemma 4, it follows that the sequence $\{y(k)\}$ is bounded. In view of Eq. (18), this implies that the sequence $\{y(k)\}$ has a limit point $\tilde{x}$ that belongs to the set $X = \cap_{j=1}^m X_j$. Furthermore, because $\lim_{k \to \infty} \|w^i(k) - y(k)\| = 0$ for all $i$, we conclude that $\tilde{x}$ is also a limit point of the sequence $\{w^i(k)\}$ for all $i$. Since the sum sequence $\{\sum_{i=1}^m \|w^i(k) - \tilde{x}\|\}$ is nonincreasing by Lemma 3(c) and since each $\{w^i(k)\}$ is converging to $\tilde{x}$ along a subsequence, it follows that

$$\lim_{k \to \infty} \sum_{i=1}^m \|w^i(k) - \tilde{x}\| = 0,$$

implying $\lim_{k \to \infty} \|w^i(k) - \tilde{x}\| = 0$ for all $i$. Using this, together with the relations $\lim_{k \to \infty} \|w^i(k) - y(k)\| = 0$ and $\lim_{k \to \infty} \|x^i(k) - y(k)\| = 0$ for all $i$ (cf. Lemma 4), we conclude

$$\lim_{k \to \infty} \|x^i(k) - \tilde{x}\| = 0 \quad \text{for all } i. \qquad ■$$
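The behavior described by Lemma 3 and Proposition 2 can be observed in a small simulation. The Python sketch below is a minimal example under stated assumptions: scalar estimates, three interval constraint sets `SETS` whose intersection is $[1.5, 2]$, and a hypothetical weight schedule alternating between two doubly stochastic matrices. None of these choices come from the paper. Each iteration forms $w^i(k)$ by averaging and then projects on $X_i$, as in the projected consensus algorithm.

```python
# Illustrative data: X_1, X_2, X_3 are intervals with intersection [1.5, 2].
SETS = [(0.0, 2.0), (1.0, 3.0), (1.5, 4.0)]

# Hypothetical doubly stochastic weight matrices, used alternately.
A_EVEN = [[0.5, 0.5, 0.0], [0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]
A_ODD = [[1.0, 0.0, 0.0], [0.0, 0.5, 0.5], [0.0, 0.5, 0.5]]


def project(x, interval):
    """Euclidean projection of a scalar on a closed interval."""
    lo, hi = interval
    return min(max(x, lo), hi)


def projected_consensus(x0, num_iters):
    """Run w^i(k) = sum_j a^i_j(k) x^j(k), x^i(k+1) = P_{X_i}[w^i(k)]."""
    m = len(x0)
    x = list(x0)
    trajectory = [list(x)]
    errors = []
    for k in range(num_iters):
        A = A_EVEN if k % 2 == 0 else A_ODD
        w = [sum(A[i][j] * x[j] for j in range(m)) for i in range(m)]
        x = [project(w[i], SETS[i]) for i in range(m)]
        errors.append([x[i] - w[i] for i in range(m)])  # projection errors e^i(k)
        trajectory.append(list(x))
    return trajectory, errors


def total_distance(xs, point):
    """Sum over agents of |x^i - point|, the quantity in Lemma 3(c)."""
    return sum(abs(xi - point) for xi in xs)


trajectory, errors = projected_consensus([0.0, 3.0, 4.0], 200)

if __name__ == "__main__":
    print("final estimates:", trajectory[-1])
    print("last projection errors:", errors[-1])
```

In this run one can check that the estimates agree in the limit, that the common value lies in the intersection set, and that `total_distance(trajectory[k], 1.75)` is nonincreasing in `k`, in line with Lemma 3(c).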



3.6 Convergence Rate 

In this section, we establish a convergence rate result for the iterates $x^i(k)$ generated by the projected consensus algorithm (3) for the case when the weights are time-invariant and equal, i.e., $a^i(k) = (1/m, \ldots, 1/m)'$ for all $i$ and $k$. In our multi-agent model, this case corresponds to a fixed and complete connectivity graph, where each agent is connected to every other agent. We provide our rate estimate under an interior point assumption on the sets $X_i$, stated in Assumption 1.

We first establish a bound on the distance from the vectors of a convergent sequence to the limit point of the sequence. This relation holds for constant uniform weights, and it is motivated by a similar estimate used in the analysis of alternating projections methods in Gubin et al. [11] (see the proof of Lemma 6 there).

Lemma 5 Let $Y$ be a nonempty closed convex set in $\mathbb{R}^n$. Let $\{u(k)\} \subseteq \mathbb{R}^n$ be a sequence converging to some $\tilde{y} \in Y$, and such that $\|u(k+1) - y\| \le \|u(k) - y\|$ for all $y \in Y$ and all $k$. We then have

$$\|u(k) - \tilde{y}\| \le 2\, \mathrm{dist}(u(k), Y) \quad \text{for all } k \ge 0.$$

Proof. Let $B(x, \alpha)$ denote the closed ball centered at a vector $x$ with radius $\alpha$, i.e., $B(x, \alpha) = \{z \mid \|z - x\| \le \alpha\}$. For each $l$, consider the sets

$$S_l = \bigcap_{k=0}^{l} B\bigl(P_Y[u(k)],\, \mathrm{dist}(u(k), Y)\bigr).$$

The sets $S_l$ are convex, compact, and nested, i.e., $S_{l+1} \subseteq S_l$ for all $l$. The nonincreasing property of the sequence $\{u(k)\}$ implies that $\|u(k+s) - P_Y[u(k)]\| \le \|u(k) - P_Y[u(k)]\|$ for all $k, s \ge 0$; hence, the sets $S_l$ are also nonempty. Consequently, their intersection $\cap_{l=0}^{\infty} S_l$ is nonempty and every point $y^* \in \cap_{l=0}^{\infty} S_l$ is a limit point of the sequence $\{u(k)\}$. By assumption, the sequence $\{u(k)\}$ converges to $\tilde{y} \in Y$, and therefore, $\cap_{l=0}^{\infty} S_l = \{\tilde{y}\}$. Then, in view of the definition of the sets $S_l$, we obtain for all $k$,

$$\|u(k) - \tilde{y}\| \le \|u(k) - P_Y[u(k)]\| + \|P_Y[u(k)] - \tilde{y}\| \le 2\, \mathrm{dist}(u(k), Y). \qquad ■$$

We now establish a convergence rate result for constant uniform weights. In particular, we show that the projected consensus algorithm converges with a geometric rate under the Interior Point assumption.

Proposition 3 Let Assumptions 1, 2, 3, 4, and 5 hold. Let the weight vectors $a^i(k)$ in algorithm (3) be given by $a^i(k) = (1/m, \ldots, 1/m)'$ for all $i$ and $k$. For all $i$, let the sequence $\{x^i(k)\}$ be generated by the algorithm (3). We then have

$$\sum_{i=1}^m \|x^i(k) - \tilde{x}\|^2 \le \Bigl(1 - \frac{1}{4R^2}\Bigr)^{k} \sum_{i=1}^m \|x^i(0) - \tilde{x}\|^2 \quad \text{for all } k \ge 0,$$

where $\tilde{x} \in X$ is the limit of the sequence $\{x^i(k)\}$, and $R = \frac{1}{\delta} \sum_{i=1}^m \|x^i(0) - \bar{x}\|$ with $\bar{x}$ and $\delta$ given in the Interior Point assumption.

Proof. Since the weight vectors $a^i(k)$ are given by $a^i(k) = (1/m, \ldots, 1/m)'$, it follows that

$$w^i(k) = w(k) = \frac{1}{m} \sum_{j=1}^m x^j(k) \quad \text{for all } i,$$

[see the definition of $w^i(k)$ in Eq. (7)]. For all $k \ge 0$, using Lemma 2(b) with the identification $x_i = x^i(k)$ for each $i = 1, \ldots, m$, and $\hat{x} = w(k)$, we obtain

$$\mathrm{dist}(w(k), X) \le \frac{1}{\delta m} \Bigl( \sum_{j=1}^m \|x^j(k) - \bar{x}\| \Bigr) \Bigl( \sum_{j=1}^m \mathrm{dist}(w(k), X_j) \Bigr),$$

where the vector $\bar{x}$ and the scalar $\delta$ are given in Assumption 1. Since $\bar{x} \in X$, the sequence $\{\sum_{i=1}^m \|x^i(k) - \bar{x}\|\}$ is nonincreasing by Lemma 3(c). Therefore, we have $\sum_{i=1}^m \|x^i(k) - \bar{x}\| \le \sum_{i=1}^m \|x^i(0) - \bar{x}\|$ for all $k \ge 0$. Defining the constant $R = \frac{1}{\delta} \sum_{i=1}^m \|x^i(0) - \bar{x}\|$ and substituting in the preceding relation, we obtain

$$\mathrm{dist}(w(k), X) \le \frac{R}{m} \sum_{j=1}^m \mathrm{dist}(w(k), X_j) = \frac{R}{m} \sum_{j=1}^m \|w(k) - x^j(k+1)\|, \tag{19}$$

where the second relation follows in view of the definition of $x^j(k+1)$ as the projection of $w(k)$ on $X_j$ [cf. algorithm (3)].

By Proposition 2, we have $w(k) \to \tilde{x}$ for some $\tilde{x} \in X$ as $k \to \infty$. Furthermore, by Lemma 3(c) and the relation $w^i(k) = w(k)$ for all $i$ and $k$, we have that the sequence $\{\|w(k) - x\|\}$ is nonincreasing for any $x \in X$. Therefore, the sequence $\{w(k)\}$ satisfies the conditions of Lemma 5, and by using this lemma we obtain

$$\|w(k) - \tilde{x}\| \le 2\, \mathrm{dist}(w(k), X) \quad \text{for all } k.$$

Combining this relation with Eq. (19), we further obtain

$$\|w(k) - \tilde{x}\| \le \frac{2R}{m} \sum_{i=1}^m \|w(k) - x^i(k+1)\|.$$

Taking the square of both sides and using the convexity of the square function $(\cdot)^2$, we have

$$\|w(k) - \tilde{x}\|^2 \le \frac{4R^2}{m} \sum_{i=1}^m \|w(k) - x^i(k+1)\|^2. \tag{20}$$

Since $x^i(k+1) = P_{X_i}[w(k)]$ for all $i$ and $k$, using Lemma 3(a) with the substitutions $x = \tilde{x} \in X$ and $e^i(k) = x^i(k+1) - w(k)$ for all $i$, we see that

$$\sum_{i=1}^m \|w(k) - x^i(k+1)\|^2 \le m\, \|w(k) - \tilde{x}\|^2 - \sum_{i=1}^m \|x^i(k+1) - \tilde{x}\|^2 \quad \text{for all } k.$$

Using this relation in Eq. (20), we obtain

$$\|w(k) - \tilde{x}\|^2 \le \frac{4R^2}{m} \Bigl( m\, \|w(k) - \tilde{x}\|^2 - \sum_{i=1}^m \|x^i(k+1) - \tilde{x}\|^2 \Bigr).$$

Rearranging the terms and using the relation $m\, \|w(k) - \tilde{x}\|^2 \le \sum_{i=1}^m \|x^i(k) - \tilde{x}\|^2$ [cf. Lemma 3(a) with $w(k) = w^i(k)$ and $x = \tilde{x}$], we obtain

$$\sum_{i=1}^m \|x^i(k+1) - \tilde{x}\|^2 \le \Bigl(1 - \frac{1}{4R^2}\Bigr) \sum_{i=1}^m \|x^i(k) - \tilde{x}\|^2,$$

which yields the desired result. ■ 
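The geometric bound can be checked numerically on a toy instance. The following Python sketch is only an illustration under assumptions of ours: scalar estimates, interval sets `SETS` with intersection $[1.5, 2]$, the interior point $\bar{x} = 1.75$ with $\delta = 0.25$ (names `X_BAR` and `DELTA`), and the limit $\tilde{x}$ approximated by the average of the final iterates. It compares $\sum_i \|x^i(k) - \tilde{x}\|^2$ with $(1 - 1/(4R^2))^k \sum_i \|x^i(0) - \tilde{x}\|^2$.

```python
# Illustrative interval constraint sets; their intersection is [1.5, 2].
SETS = [(0.0, 2.0), (1.0, 3.0), (1.5, 4.0)]
# Assumed interior point data: the ball of radius 0.25 around 1.75 lies in [1.5, 2].
X_BAR, DELTA = 1.75, 0.25


def run_uniform_consensus(x0, num_iters):
    """Projected consensus with uniform weights a^i(k) = (1/m, ..., 1/m)'."""
    m = len(x0)
    x = list(x0)
    trajectory = [list(x)]
    for _ in range(num_iters):
        w = sum(x) / m                                   # common average w(k)
        x = [min(max(w, lo), hi) for (lo, hi) in SETS]   # x^i(k+1) = P_{X_i}[w(k)]
        trajectory.append(list(x))
    return trajectory


def squared_error(xs, limit):
    """Sum over agents of |x^i - limit|^2."""
    return sum((xi - limit) ** 2 for xi in xs)


x0 = [0.0, 3.0, 4.0]
traj = run_uniform_consensus(x0, 200)
x_tilde = sum(traj[-1]) / len(x0)       # numerical stand-in for the limit point
R = sum(abs(xi - X_BAR) for xi in x0) / DELTA
rate = 1.0 - 1.0 / (4.0 * R ** 2)

if __name__ == "__main__":
    s0 = squared_error(traj[0], x_tilde)
    for k in (0, 1, 5, 20):
        print(k, squared_error(traj[k], x_tilde), rate ** k * s0)
```

In this instance the observed decay is much faster than the guaranteed factor $1 - 1/(4R^2)$, which is consistent with the proposition since it only provides an upper bound.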

4 Constrained Optimization 

We next consider the problem of optimizing the sum of convex objective functions corresponding to $m$ agents connected over a time-varying topology. The goal of the agents is to cooperatively solve the constrained optimization problem

$$\text{minimize} \quad \sum_{i=1}^m f_i(x) \tag{21}$$

$$\text{subject to} \quad x \in \bigcap_{i=1}^m X_i, \tag{22}$$

where each $f_i : \mathbb{R}^n \to \mathbb{R}$ is a convex function, representing the local objective function of agent $i$, and each $X_i \subseteq \mathbb{R}^n$ is a closed convex set, representing the local constraint set of agent $i$. We assume that the local objective function $f_i$ and the local constraint set $X_i$ are known to agent $i$ only.

To keep our discussion general, we do not assume differentiability of any of the functions $f_i$. Since each $f_i$ is convex over the entire $\mathbb{R}^n$, the function is differentiable almost everywhere (see [2] or [25]). At the points where the function fails to be differentiable, a subgradient exists and can be used in the role of a gradient. In particular, for a given convex function $F : \mathbb{R}^n \to \mathbb{R}$ and a point $\bar{x}$, a subgradient of the function $F$ at $\bar{x}$ is a vector $s_F(\bar{x}) \in \mathbb{R}^n$ such that

$$F(\bar{x}) + s_F(\bar{x})'(x - \bar{x}) \le F(x) \quad \text{for all } x. \tag{23}$$

The set of all subgradients of $F$ at a given point $\bar{x}$ is denoted by $\partial F(\bar{x})$, and it is referred to as the subdifferential set of $F$ at $\bar{x}$.
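To make the subgradient inequality (23) concrete, the Python sketch below uses the $\ell_1$ norm as an example of a convex function that is not differentiable everywhere. The choice of function and the helper names `l1_norm`, `l1_subgradient`, and `subgradient_gap` are assumptions for illustration only. The componentwise sign vector (with 0 at the kinks) is one valid subgradient, and the gap $F(x) - F(\bar{x}) - s_F(\bar{x})'(x - \bar{x})$ is nonnegative for every $x$.

```python
def l1_norm(x):
    """F(x) = sum of absolute values, convex but not differentiable at kinks."""
    return sum(abs(t) for t in x)


def l1_subgradient(x):
    """One subgradient of the l1 norm at x: componentwise sign, 0 at kinks."""
    return [float((t > 0) - (t < 0)) for t in x]


def subgradient_gap(F, s, x_bar, x):
    """F(x) - F(x_bar) - s'(x - x_bar); nonnegative if s is a subgradient at x_bar."""
    inner = sum(si * (xi - bi) for si, xi, bi in zip(s, x, x_bar))
    return F(x) - F(x_bar) - inner


if __name__ == "__main__":
    x_bar = [1.0, 0.0, -2.0]
    s = l1_subgradient(x_bar)
    for x in ([0.0, 0.0, 0.0], [2.0, -1.0, 3.0], [-1.0, 0.5, -2.0]):
        print(x, subgradient_gap(l1_norm, s, x_bar, x))
```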

4.1 Distributed Projected Subgradient Algorithm 

We introduce a distributed subgradient method for solving problem (21) using the assumptions imposed on the information exchange among the agents in Section 3.3. The main idea of the algorithm is the use of consensus as a mechanism for distributing the computations among the agents. In particular, each agent $i$ starts with an initial estimate $x^i(0) \in X_i$ and updates its estimate by combining the estimates received from its neighbors, by taking a subgradient step to minimize its objective function $f_i$, and by projecting on its constraint set $X_i$. Formally, each agent $i$ updates according to the following rule:

$$v^i(k) = \sum_{j=1}^m a^i_j(k)\, x^j(k) \tag{24}$$

$$x^i(k+1) = P_{X_i}\bigl[v^i(k) - \alpha_k d_i(k)\bigr], \tag{25}$$

where the scalars $a^i_j(k)$ are nonnegative weights and the scalar $\alpha_k > 0$ is a stepsize. The vector $d_i(k)$ is a subgradient of the agent $i$ local objective function $f_i(x)$ at $x = v^i(k)$.

We refer to the method (24)-(25) as the projected subgradient algorithm. To analyze this algorithm, we find it convenient to re-write the relation for $x^i(k+1)$ in an equivalent form. This form helps us identify the linear effects due to agents mixing the estimates [which will be driven by the transition matrices $\Phi(k,s)$], and the nonlinear effects due to taking subgradient steps and projecting. In particular, we re-write the relations (24)-(25) as follows:

$$v^i(k) = \sum_{j=1}^m a^i_j(k)\, x^j(k)$$

$$x^i(k+1) = v^i(k) - \alpha_k d_i(k) + \phi^i(k) \tag{26}$$

$$\phi^i(k) = P_{X_i}\bigl[v^i(k) - \alpha_k d_i(k)\bigr] - \bigl(v^i(k) - \alpha_k d_i(k)\bigr). \tag{27}$$
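A single iteration of the update, together with the decomposition into a linear part, a subgradient step, and a projection error, can be sketched in Python as follows. This is a minimal sketch under assumptions of ours: scalar estimates, interval constraint sets, and hypothetical quadratic local objectives $f_i(x) = (x - c_i)^2$ created by `make_quadratic_subgradient`. The function returns $v^i(k)$, $d_i(k)$, $x^i(k+1)$, and the projection error $\phi^i(k)$.

```python
def make_quadratic_subgradient(c):
    """Gradient of the hypothetical local objective f_i(x) = (x - c)^2."""
    return lambda x: 2.0 * (x - c)


def projected_subgradient_step(x, A, alpha, subgrads, sets):
    """One step: average, take a subgradient step, project on X_i.

    x        : current scalar estimates x^j(k), one per agent
    A        : weight matrix with A[i][j] = a^i_j(k)
    alpha    : stepsize alpha_k
    subgrads : list of callables returning a subgradient of f_i
    sets     : list of intervals (lo, hi) representing X_i
    """
    m = len(x)
    v = [sum(A[i][j] * x[j] for j in range(m)) for i in range(m)]
    d = [subgrads[i](v[i]) for i in range(m)]
    unprojected = [v[i] - alpha * d[i] for i in range(m)]
    x_next = [min(max(unprojected[i], sets[i][0]), sets[i][1]) for i in range(m)]
    phi = [x_next[i] - unprojected[i] for i in range(m)]  # projection error
    return v, d, x_next, phi


if __name__ == "__main__":
    A = [[1.0 / 3.0] * 3 for _ in range(3)]
    sets = [(0.0, 2.0), (1.0, 3.0), (1.5, 4.0)]
    subgrads = [make_quadratic_subgradient(c) for c in (0.0, 0.5, 1.0)]
    print(projected_subgradient_step([0.0, 3.0, 4.0], A, 0.5, subgrads, sets))
```

The returned values satisfy $x^i(k+1) = v^i(k) - \alpha_k d_i(k) + \phi^i(k)$ by construction, which is the form used in the analysis.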

The evolution of the iterates is complicated due to the nonlinear effects of the projection operation, and even more complicated due to the projections on different sets. In our subsequent analysis, we study two special cases: 1) when the constraint sets are the same [i.e., $X_i = X$ for all $i$], but the agent connectivity is time-varying; and 2) when the constraint sets $X_i$ are different, but the agent communication graph is fully connected. In the analysis of both cases, we use a basic relation for the iterates $x^i(k)$ generated by the method in (27). The relation is established in the following lemma.

Lemma 6 Let Assumptions 2 and 3 hold. Let $\{x^i(k)\}$ be the iterates generated by the algorithm (24)-(25). We have for any $z \in X = \cap_{i=1}^m X_i$ and all $k \ge 0$,

$$\sum_{i=1}^m \|x^i(k+1) - z\|^2 \le \sum_{i=1}^m \|x^i(k) - z\|^2 + \alpha_k^2 \sum_{i=1}^m \|d_i(k)\|^2 - 2\alpha_k \sum_{i=1}^m \bigl( f_i(v^i(k)) - f_i(z) \bigr) - \sum_{i=1}^m \|\phi^i(k)\|^2.$$

Proof. Since $x^i(k+1) = P_{X_i}[v^i(k) - \alpha_k d_i(k)]$, it follows from Lemma 1(b) and from the definition of the projection error $\phi^i(k)$ in (27) that for any $z \in X$,

$$\|x^i(k+1) - z\|^2 \le \|v^i(k) - \alpha_k d_i(k) - z\|^2 - \|\phi^i(k)\|^2.$$

By expanding the term $\|v^i(k) - \alpha_k d_i(k) - z\|^2$, we obtain

$$\|v^i(k) - \alpha_k d_i(k) - z\|^2 = \|v^i(k) - z\|^2 + \alpha_k^2 \|d_i(k)\|^2 - 2\alpha_k d_i(k)'\bigl(v^i(k) - z\bigr).$$

Since $d_i(k)$ is a subgradient of $f_i(x)$ at $x = v^i(k)$, we have

$$d_i(k)'\bigl(v^i(k) - z\bigr) \ge f_i(v^i(k)) - f_i(z).$$

By combining the preceding relations, we obtain

$$\|x^i(k+1) - z\|^2 \le \|v^i(k) - z\|^2 + \alpha_k^2 \|d_i(k)\|^2 - 2\alpha_k \bigl( f_i(v^i(k)) - f_i(z) \bigr) - \|\phi^i(k)\|^2.$$

Since $v^i(k) = \sum_{j=1}^m a^i_j(k)\, x^j(k)$, using the convexity of the norm square function and the stochasticity of the weights $a^i_j(k)$, $j = 1, \ldots, m$, it follows that

$$\|v^i(k) - z\|^2 \le \sum_{j=1}^m a^i_j(k)\, \|x^j(k) - z\|^2.$$

Combining the preceding two relations, we obtain

$$\|x^i(k+1) - z\|^2 \le \sum_{j=1}^m a^i_j(k)\, \|x^j(k) - z\|^2 + \alpha_k^2 \|d_i(k)\|^2 - 2\alpha_k \bigl( f_i(v^i(k)) - f_i(z) \bigr) - \|\phi^i(k)\|^2.$$

By summing the preceding relation over $i = 1, \ldots, m$, and using the doubly stochasticity of the weights, i.e.,

$$\sum_{i=1}^m \sum_{j=1}^m a^i_j(k)\, \|x^j(k) - z\|^2 = \sum_{j=1}^m \Bigl( \sum_{i=1}^m a^i_j(k) \Bigr) \|x^j(k) - z\|^2 = \sum_{j=1}^m \|x^j(k) - z\|^2,$$

we obtain the desired relation. ■

4.1.1 Convergence when $X_i = X$ for all $i$

We first study the case when all constraint sets are the same, i.e., $X_i = X$ for all $i$. The next assumption formally states the conditions we adopt in the convergence analysis.

Assumption 6

(a) The constraint sets $X_i$ are the same, i.e., $X_i = X$ for a closed convex set $X$.

(b) The subgradient sets of each $f_i$ are bounded over the set $X$, i.e., there is a scalar $L > 0$ such that for all $i$,

$$\|d\| \le L \quad \text{for all } d \in \partial f_i(x) \text{ and all } x \in X.$$

The subgradient boundedness assumption in part (b) holds, for example, when the set $X$ is compact (see [2]).

In proving our convergence results, we use a property of the infinite sum of products of the components of two sequences. In particular, for a scalar $\beta \in (0, 1)$ and a scalar sequence $\{\gamma_k\}$, we consider the "convolution" sequence $\sum_{\ell=0}^{k} \beta^{k-\ell} \gamma_\ell = \beta^k \gamma_0 + \beta^{k-1} \gamma_1 + \cdots + \beta \gamma_{k-1} + \gamma_k$. We have the following result.

Lemma 7 Let $0 < \beta < 1$ and let $\{\gamma_k\}$ be a positive scalar sequence. Assume that $\lim_{k \to \infty} \gamma_k = 0$. Then

$$\lim_{k \to \infty} \sum_{\ell=0}^{k} \beta^{k-\ell} \gamma_\ell = 0.$$

In addition, if $\sum_k \gamma_k < \infty$, then

$$\sum_{k} \sum_{\ell=0}^{k} \beta^{k-\ell} \gamma_\ell < \infty.$$

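Proof. Let $\epsilon > 0$ be arbitrary. Since $\gamma_k \to 0$, there is an index $K$ such that $\gamma_k \le \epsilon$ for all $k \ge K$. For all $k \ge K + 1$, we have

$$\sum_{\ell=0}^{k} \beta^{k-\ell} \gamma_\ell = \sum_{\ell=0}^{K} \beta^{k-\ell} \gamma_\ell + \sum_{\ell=K+1}^{k} \beta^{k-\ell} \gamma_\ell \le \max_{0 \le t \le K} \gamma_t \sum_{\ell=0}^{K} \beta^{k-\ell} + \epsilon \sum_{\ell=K+1}^{k} \beta^{k-\ell}.$$

Since $\sum_{\ell=K+1}^{k} \beta^{k-\ell} \le \frac{1}{1-\beta}$ and

$$\sum_{\ell=0}^{K} \beta^{k-\ell} = \beta^k + \cdots + \beta^{k-K} = \beta^{k-K} \bigl(1 + \cdots + \beta^K\bigr) \le \frac{\beta^{k-K}}{1-\beta},$$

it follows that for all $k \ge K + 1$,

$$\sum_{\ell=0}^{k} \beta^{k-\ell} \gamma_\ell \le \max_{0 \le t \le K} \gamma_t\, \frac{\beta^{k-K}}{1-\beta} + \frac{\epsilon}{1-\beta}.$$

Therefore,

$$\limsup_{k \to \infty} \sum_{\ell=0}^{k} \beta^{k-\ell} \gamma_\ell \le \frac{\epsilon}{1-\beta}.$$

Since $\epsilon$ is arbitrary, we conclude that $\limsup_{k \to \infty} \sum_{\ell=0}^{k} \beta^{k-\ell} \gamma_\ell = 0$, implying

$$\lim_{k \to \infty} \sum_{\ell=0}^{k} \beta^{k-\ell} \gamma_\ell = 0.$$

Suppose now $\sum_k \gamma_k < \infty$. Then, for any integer $M \ge 1$, we have

$$\sum_{k=0}^{M} \Bigl( \sum_{\ell=0}^{k} \beta^{k-\ell} \gamma_\ell \Bigr) = \sum_{\ell=0}^{M} \gamma_\ell \sum_{t=0}^{M-\ell} \beta^t \le \frac{1}{1-\beta} \sum_{\ell=0}^{M} \gamma_\ell,$$

implying that

$$\sum_{k=0}^{\infty} \Bigl( \sum_{\ell=0}^{k} \beta^{k-\ell} \gamma_\ell \Bigr) \le \frac{1}{1-\beta} \sum_{\ell=0}^{\infty} \gamma_\ell < \infty. \qquad ■$$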
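Both claims of Lemma 7 are easy to observe numerically. The Python sketch below is an illustration under assumed data ($\beta = 0.5$, $\gamma_k = 1/(k+1)$ for the first claim and $\gamma_k = 1/(k+1)^2$ for the second); the function name `convolution_sequence` is ours. It uses the recursion $c_k = \beta c_{k-1} + \gamma_k$, which reproduces $c_k = \sum_{\ell=0}^{k} \beta^{k-\ell} \gamma_\ell$.

```python
def convolution_sequence(beta, gamma):
    """Return [c_0, c_1, ...] with c_k = sum_{l=0}^{k} beta^(k-l) * gamma[l].

    Uses the equivalent recursion c_k = beta * c_{k-1} + gamma[k], c_{-1} = 0.
    """
    out = []
    c = 0.0
    for g in gamma:
        c = beta * c + g
        out.append(c)
    return out


BETA = 0.5
# gamma_k -> 0 but is not summable: only the first claim applies.
gamma_vanishing = [1.0 / (k + 1) for k in range(2000)]
# gamma_k is summable: the second claim applies as well.
gamma_summable = [1.0 / (k + 1) ** 2 for k in range(2000)]

c_vanishing = convolution_sequence(BETA, gamma_vanishing)
c_summable = convolution_sequence(BETA, gamma_summable)

if __name__ == "__main__":
    print("c_k for large k:", c_vanishing[-1])
    print("partial sum of c_k:", sum(c_summable))
    print("bound from the proof:", sum(gamma_summable) / (1.0 - BETA))
```

The partial sums of the convolution sequence stay below $\frac{1}{1-\beta} \sum_\ell \gamma_\ell$, which is the bound obtained in the last step of the proof.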



Our goal is to show that the agent disagreements $\|x^i(k) - x^j(k)\|$ converge to zero. To measure the agent disagreements $\|x^i(k) - x^j(k)\|$ in time, we consider their average $\frac{1}{m} \sum_{j=1}^m x^j(k)$ and consider the agent disagreement with respect to this average. In particular, we define

$$y(k) = \frac{1}{m} \sum_{j=1}^m x^j(k) \quad \text{for all } k.$$

In view of Eq. (26), we have

$$y(k+1) = \frac{1}{m} \sum_{i=1}^m v^i(k) - \frac{\alpha_k}{m} \sum_{i=1}^m d_i(k) + \frac{1}{m} \sum_{i=1}^m \phi^i(k).$$

When the weights are doubly stochastic, since $v^i(k) = \sum_{j=1}^m a^i_j(k)\, x^j(k)$, it follows that

$$y(k+1) = y(k) - \frac{\alpha_k}{m} \sum_{i=1}^m d_i(k) + \frac{1}{m} \sum_{i=1}^m \phi^i(k). \tag{28}$$

Under Assumption 6, the assumptions on the agent weights and connectivity stated in Section 3.3, and some conditions on the stepsize $\alpha_k$, the next lemma studies the convergence properties of the sequences $\{\|x^i(k) - y(k)\|\}$ for all $i$.



Lemma 8 Let Assumptions 2, 3, 4, 5, and 6 hold. Let $\{x^i(k)\}$ be the iterates generated by the algorithm (24)-(25) and consider the auxiliary sequence $\{y(k)\}$ defined in (28).

(a) If the stepsize satisfies $\lim_{k \to \infty} \alpha_k = 0$, then

$$\lim_{k \to \infty} \|x^i(k) - y(k)\| = 0 \quad \text{for all } i.$$

(b) If the stepsize satisfies $\sum_k \alpha_k^2 < \infty$, then

$$\sum_{k=1}^{\infty} \alpha_k \|x^i(k) - y(k)\| < \infty \quad \text{for all } i.$$



Proof. (a) Using the relations in (27) and the transition matrices $\Phi(k,s)$, we can write for all $i$, and for all $k$ and $s$ with $k > s$,

$$x^i(k+1) = \sum_{j=1}^m [\Phi(k,s)]^i_j\, x^j(s) - \sum_{r=s}^{k-1} \sum_{j=1}^m [\Phi(k,r+1)]^i_j\, \alpha_r d_j(r) - \alpha_k d_i(k) + \sum_{r=s}^{k-1} \sum_{j=1}^m [\Phi(k,r+1)]^i_j\, \phi^j(r) + \phi^i(k).$$

Similarly, using the transition matrices and relation (28), we can write for $y(k+1)$ and for all $k$ and $s$ with $k > s$,

$$y(k+1) = y(s) - \frac{1}{m} \sum_{r=s}^{k-1} \sum_{j=1}^m \alpha_r d_j(r) - \frac{\alpha_k}{m} \sum_{i=1}^m d_i(k) + \frac{1}{m} \sum_{r=s}^{k-1} \sum_{j=1}^m \phi^j(r) + \frac{1}{m} \sum_{j=1}^m \phi^j(k).$$

Therefore, since $y(s) = \frac{1}{m} \sum_{j=1}^m x^j(s)$, we have for $s = 0$,

$$\|x^i(k) - y(k)\| \le \sum_{j=1}^m \Bigl| [\Phi(k-1,0)]^i_j - \frac{1}{m} \Bigr| \, \|x^j(0)\| + \sum_{r=0}^{k-2} \sum_{j=1}^m \Bigl| [\Phi(k-1,r+1)]^i_j - \frac{1}{m} \Bigr| \, \alpha_r \|d_j(r)\| + \alpha_{k-1} \|d_i(k-1)\| + \frac{\alpha_{k-1}}{m} \sum_{j=1}^m \|d_j(k-1)\|$$

$$\qquad + \sum_{r=0}^{k-2} \sum_{j=1}^m \Bigl| [\Phi(k-1,r+1)]^i_j - \frac{1}{m} \Bigr| \, \|\phi^j(r)\| + \|\phi^i(k-1)\| + \frac{1}{m} \sum_{j=1}^m \|\phi^j(k-1)\|.$$

Using the estimate for $\bigl| [\Phi(k,s)]^i_j - \frac{1}{m} \bigr|$ of Proposition 1, we have for all $k \ge s$,

$$\Bigl| [\Phi(k,s)]^i_j - \frac{1}{m} \Bigr| \le C \beta^{k-s} \quad \text{for all } i, j,$$

with $C = 2\, \frac{1 + \eta^{-B_0}}{1 - \eta^{B_0}}$ and $\beta = \bigl(1 - \eta^{B_0}\bigr)^{\frac{1}{B_0}}$. Hence, using this relation and the subgradient boundedness, we obtain for all $i$ and $k \ge 2$,

$$\|x^i(k) - y(k)\| \le m C \beta^{k-1} \sum_{j=1}^m \|x^j(0)\| + m C L \sum_{r=0}^{k-2} \beta^{k-r} \alpha_r + 2 \alpha_{k-1} L + C \sum_{r=0}^{k-2} \beta^{k-r} \sum_{j=1}^m \|\phi^j(r)\| + \|\phi^i(k-1)\| + \frac{1}{m} \sum_{j=1}^m \|\phi^j(k-1)\|. \tag{29}$$

We next show that the errors $\phi^i(k)$ satisfy $\|\phi^i(k)\| \le \alpha_k L$ for all $i$ and $k$. In view of the relations in (27), since $x^j(k) \in X_j = X$ for all $k$ and $j$, and the vector $a^i(k)$ is stochastic for all $i$ and $k$, it follows that $v^i(k) \in X$ for all $i$ and $k$. Furthermore, by the projection property in Lemma 1(b), we have for all $i$ and $k$,

$$\|x^i(k+1) - v^i(k)\|^2 \le \|v^i(k) - \alpha_k d_i(k) - v^i(k)\|^2 - \|x^i(k+1) - (v^i(k) - \alpha_k d_i(k))\|^2 \le \alpha_k^2 L^2 - \|\phi^i(k)\|^2,$$

where in the last inequality we use $\|d_i(k)\| \le L$ (see Assumption 6). It follows that $\|\phi^i(k)\| \le \alpha_k L$ for all $i$ and $k$. By using this in relation (29), we obtain

$$\|x^i(k) - y(k)\| \le m C \beta^{k-1} \sum_{j=1}^m \|x^j(0)\| + 2 m C L \sum_{r=0}^{k-2} \beta^{k-r} \alpha_r + 4 \alpha_{k-1} L. \tag{30}$$

By taking the limit superior in relation (30) and using the facts $\beta^k \to 0$ (recall $0 < \beta < 1$) and $\alpha_k \to 0$, we obtain for all $i$,

$$\limsup_{k \to \infty} \|x^i(k) - y(k)\| \le 2 m C L \limsup_{k \to \infty} \sum_{r=0}^{k-2} \beta^{k-r} \alpha_r.$$

Finally, since $0 < \beta < 1$ and $\lim_{k \to \infty} \alpha_k = 0$, by Lemma 7 we have

$$\lim_{k \to \infty} \sum_{r=0}^{k-2} \beta^{k-r} \alpha_r = 0.$$

In view of the preceding two relations, it follows that $\lim_{k \to \infty} \|x^i(k) - y(k)\| = 0$ for all $i$.

(b) By multiplying the relation in (30) with $\alpha_k$, we obtain

$$\alpha_k \|x^i(k) - y(k)\| \le m C \alpha_k \beta^{k-1} \sum_{j=1}^m \|x^j(0)\| + 2 m C L \sum_{r=0}^{k-2} \beta^{k-r} \alpha_k \alpha_r + 4 \alpha_k \alpha_{k-1} L.$$

By using $\alpha_k \beta^{k-1} \le \alpha_k^2 + \beta^{2(k-1)}$ and $2 \alpha_k \alpha_r \le \alpha_k^2 + \alpha_r^2$ for any $k$ and $r$, we have

$$\alpha_k \|x^i(k) - y(k)\| \le m C \beta^{2(k-1)} \sum_{j=1}^m \|x^j(0)\| + m C A \alpha_k^2 + m C L \sum_{r=0}^{k-2} \beta^{k-r} \alpha_r^2 + 2L \bigl( \alpha_k^2 + \alpha_{k-1}^2 \bigr),$$

where $A = \sum_{j=1}^m \|x^j(0)\| + \frac{L}{1-\beta}$. Therefore, by summing and grouping some of the terms, we obtain

$$\sum_{k=1}^{\infty} \alpha_k \|x^i(k) - y(k)\| \le m C \Bigl( \sum_{k=1}^{\infty} \beta^{2(k-1)} \Bigr) \sum_{j=1}^m \|x^j(0)\| + (m C A + 4L) \sum_{k=0}^{\infty} \alpha_k^2 + m C L \sum_{k=1}^{\infty} \sum_{r=0}^{k-2} \beta^{k-r} \alpha_r^2.$$

In the preceding relation, the first term is summable since $0 < \beta < 1$. The second term is summable since $\sum_k \alpha_k^2 < \infty$. The third term is also summable by Lemma 7. Hence, $\sum_{k=1}^{\infty} \alpha_k \|x^i(k) - y(k)\| < \infty$. ■

Using Lemmas 6 and 8, we next show that the iterates $x^i(k)$ converge to an optimal solution when we use a stepsize converging to zero fast enough.

Proposition 4 Let Assumptions 2, 3, 4, 5, and 6 hold. Let $\{x^i(k)\}$ be the iterates generated by the algorithm (24)-(25) with the stepsize satisfying $\sum_k \alpha_k = \infty$ and $\sum_k \alpha_k^2 < \infty$. In addition, assume that the optimal solution set $X^*$ is nonempty. Then, there exists an optimal point $x^* \in X^*$ such that

$$\lim_{k \to \infty} \|x^i(k) - x^*\| = 0 \quad \text{for all } i.$$

Proof. From Lemma 6, we have for $z \in X$ and all $k$,

$$\sum_{i=1}^m \|x^i(k+1) - z\|^2 \le \sum_{j=1}^m \|x^j(k) - z\|^2 + \alpha_k^2 \sum_{i=1}^m \|d_i(k)\|^2 - 2\alpha_k \sum_{i=1}^m \bigl( f_i(v^i(k)) - f_i(z) \bigr) - \sum_{i=1}^m \|\phi^i(k)\|^2.$$

By dropping the nonpositive term on the right hand side, and by using the subgradient boundedness, we obtain

$$\sum_{i=1}^m \|x^i(k+1) - z\|^2 \le \sum_{j=1}^m \|x^j(k) - z\|^2 + \alpha_k^2 m L^2 - 2\alpha_k \sum_{i=1}^m \bigl( f_i(v^i(k)) - f_i(y(k)) \bigr) - 2\alpha_k \bigl( f(y(k)) - f(z) \bigr), \tag{31}$$

where $f = \sum_{i=1}^m f_i$. In view of the subgradient boundedness and the stochasticity of the weights, it follows

$$|f_i(v^i(k)) - f_i(y(k))| \le L \|v^i(k) - y(k)\| \le L \sum_{j=1}^m a^i_j(k)\, \|x^j(k) - y(k)\|,$$

implying, by the doubly stochasticity of the weights, that

$$\sum_{i=1}^m |f_i(v^i(k)) - f_i(y(k))| \le L \sum_{j=1}^m \Bigl( \sum_{i=1}^m a^i_j(k) \Bigr) \|x^j(k) - y(k)\| = L \sum_{j=1}^m \|x^j(k) - y(k)\|.$$

By using this in relation (31), we see that for any $z \in X$ and all $k$,

$$\sum_{i=1}^m \|x^i(k+1) - z\|^2 \le \sum_{j=1}^m \|x^j(k) - z\|^2 + \alpha_k^2 m L^2 + 2\alpha_k L \sum_{j=1}^m \|x^j(k) - y(k)\| - 2\alpha_k \bigl( f(y(k)) - f(z) \bigr).$$

By letting $z = z^* \in X^*$, and by re-arranging the terms and summing these relations over some arbitrary window from $K$ to $N$ with $K < N$, we obtain for any $z^* \in X^*$,

$$\sum_{i=1}^m \|x^i(N+1) - z^*\|^2 + 2 \sum_{k=K}^{N} \alpha_k \bigl( f(y(k)) - f(z^*) \bigr) \le \sum_{i=1}^m \|x^i(K) - z^*\|^2 + m L^2 \sum_{k=K}^{N} \alpha_k^2 + 2L \sum_{k=K}^{N} \alpha_k \sum_{j=1}^m \|x^j(k) - y(k)\|. \tag{32}$$

By letting $K = 1$ and $N \to \infty$ in relation (32), and using $\sum_{k=1}^{\infty} \alpha_k^2 < \infty$ and $\sum_{k=1}^{\infty} \alpha_k \sum_{j=1}^m \|x^j(k) - y(k)\| < \infty$ [which follows by Lemma 8], we obtain

$$\sum_{k=1}^{\infty} \alpha_k \bigl( f(y(k)) - f(z^*) \bigr) < \infty.$$

Since $x^j(k) \in X$ for all $j$, we have $y(k) \in X$ for all $k$. Since $z^* \in X^*$, it follows that $f(y(k)) - f^* \ge 0$ for all $k$, where $f^* = f(z^*)$ denotes the optimal value. This relation, the assumption that $\sum_{k=1}^{\infty} \alpha_k = \infty$, and $\sum_{k=1}^{\infty} \alpha_k \bigl( f(y(k)) - f(z^*) \bigr) < \infty$ imply

$$\liminf_{k \to \infty} \bigl( f(y(k)) - f^* \bigr) = 0. \tag{33}$$

We next show that each sequence $\{x^i(k)\}$ converges to the same optimal point. By dropping the nonnegative term involving $f(y(k)) - f(z^*)$ in (32), we have

$$\sum_{i=1}^m \|x^i(N+1) - z^*\|^2 \le \sum_{i=1}^m \|x^i(K) - z^*\|^2 + m L^2 \sum_{k=K}^{N} \alpha_k^2 + 2L \sum_{k=K}^{N} \alpha_k \sum_{j=1}^m \|x^j(k) - y(k)\|.$$

Since $\sum_k \alpha_k^2 < \infty$ and $\sum_{k=1}^{\infty} \alpha_k \sum_{j=1}^m \|x^j(k) - y(k)\| < \infty$, it follows that the sequence $\{x^i(k)\}$ is bounded for each $i$, and

$$\limsup_{N \to \infty} \sum_{i=1}^m \|x^i(N+1) - z^*\|^2 \le \liminf_{K \to \infty} \sum_{i=1}^m \|x^i(K) - z^*\|^2.$$

Thus, the scalar sequence $\{\sum_{i=1}^m \|x^i(k) - z^*\|^2\}$ is convergent for every $z^* \in X^*$. By Lemma 8, we have $\lim_{k \to \infty} \|x^i(k) - y(k)\| = 0$. Therefore, it also follows that $\{y(k)\}$ is bounded and the scalar sequence $\{\|y(k) - z^*\|\}$ is convergent for every $z^* \in X^*$. Since $y(k)$ is bounded, it must have a limit point, and in view of $\liminf_{k \to \infty} f(y(k)) = f^*$ [cf. Eq. (33)] and the continuity of $f$ (due to convexity of $f$ over $\mathbb{R}^n$), one of the limit points of $\{y(k)\}$ must belong to $X^*$; denote this limit point by $x^*$. Since the sequence $\{\|y(k) - x^*\|\}$ is convergent, it follows that $y(k)$ can have a unique limit point, i.e., $\lim_{k \to \infty} y(k) = x^*$. This and $\lim_{k \to \infty} \|x^i(k) - y(k)\| = 0$ imply that each of the sequences $\{x^i(k)\}$ converges to the same $x^* \in X^*$. ■
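The statement of Proposition 4 can be illustrated with a short simulation. The Python sketch below is a toy instance under assumptions of ours, not an experiment from the paper: scalar estimates, a common constraint set $X = [2, 4]$, hypothetical local objectives $f_i(x) = |x - c_i|$ with $c = (0, 1, 5)$, a weight schedule alternating between two doubly stochastic matrices, and the stepsize $\alpha_k = 1/(k+1)$, which satisfies $\sum_k \alpha_k = \infty$ and $\sum_k \alpha_k^2 < \infty$. On $[2, 4]$ the sum $\sum_i |x - c_i|$ is increasing, so the optimal point is $x^* = 2$.

```python
# Hypothetical problem data for illustration.
CENTERS = [0.0, 1.0, 5.0]      # f_i(x) = |x - c_i|
LO, HI = 2.0, 4.0              # common constraint set X = [2, 4]

# Doubly stochastic weight matrices used alternately (assumed schedule).
A_EVEN = [[0.5, 0.5, 0.0], [0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]
A_ODD = [[1.0, 0.0, 0.0], [0.0, 0.5, 0.5], [0.0, 0.5, 0.5]]


def sign(t):
    """A subgradient of |t|: the sign of t, with 0 at the kink."""
    return float((t > 0) - (t < 0))


def run_projected_subgradient(x0, num_iters):
    """Iterate v = A(k) x, x^i(k+1) = P_X[v^i - alpha_k d_i] with alpha_k = 1/(k+1)."""
    m = len(x0)
    x = list(x0)
    for k in range(num_iters):
        A = A_EVEN if k % 2 == 0 else A_ODD
        alpha = 1.0 / (k + 1)
        v = [sum(A[i][j] * x[j] for j in range(m)) for i in range(m)]
        d = [sign(v[i] - CENTERS[i]) for i in range(m)]
        x = [min(max(v[i] - alpha * d[i], LO), HI) for i in range(m)]
    return x


x_final = run_projected_subgradient([2.0, 3.0, 4.0], 2000)

if __name__ == "__main__":
    print("estimates after 2000 iterations:", x_final)
```

With a diminishing stepsize the agents do not agree exactly at any finite time; the residual disagreement in this sketch is of the order of the current stepsize and vanishes as $k$ grows.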
4.1.2 Convergence for uniform weights

We next consider a version of the projected subgradient algorithm (24)-(25) for the case when the agents use uniform weights, i.e., $a^i_j(k) = \frac{1}{m}$ for all $i$, $j$, and $k \ge 0$. We show that the estimates generated by the method converge to an optimal solution of problem (21) under some conditions. In particular, we adopt the following assumption in our analysis.

Assumption 7 For each $i$, the local constraint set $X_i$ is a compact set, i.e., there exists a scalar $B > 0$ such that

$$\|x\| \le B \quad \text{for all } x \in X_i \text{ and all } i.$$

An important implication of the preceding assumption is that, for each $i$, the subgradients of the function $f_i$ at all points $x \in X_i$ are uniformly bounded, i.e., there exists a scalar $L > 0$ such that

$$\|g\| \le L \quad \text{for all } g \in \partial f_i(x), \text{ all } x \in X_i, \text{ and all } i. \tag{34}$$

Under this and the interior point assumption on the intersection set $X = \cap_{i=1}^m X_i$ (cf. Assumption 1), we have the following result.

Proposition 5 Let Assumptions 1 and 7 hold. Let $\{x^i(k)\}$ be the iterates generated by the algorithm (24)-(25) with the weight vectors $a^i(k) = (1/m, \ldots, 1/m)'$ for all $i$ and $k$, and the stepsize satisfying $\sum_k \alpha_k = \infty$ and $\sum_k \alpha_k^2 < \infty$. Then, the sequences $\{x^i(k)\}$, $i = 1, \ldots, m$, converge to the same optimal point, i.e.,

$$\lim_{k \to \infty} x^i(k) = x^* \quad \text{for some } x^* \in X^* \text{ and all } i.$$



Proof. By Assumption [3, each set Xi is compact, which implies that the intersection set 
X = H^Xi is compact. Since each function fi is continuous (due to being convex over 
M n ), it follows from Weierstrass' Theorem that problem (12T|) has an optimal solution, 
denoted by z* £ X. By using Lemma [6] with z = z*, we have for all i and k > 0, 

m mm 

^Hx^ + l)-^!! 2 < ^\\x\k) - z*\\ 2 + a 2 k J2\\ d i( k )\\ 2 

i=l i=l i=l 

m m 

i=l i=l 

For any k > 0, define the vector s(k) by 

e + o e + o 
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where x(k) = — Y^iLi xl (k), e = Y^jLi dist(f (k), Xj), and 5 is the scalar given in As- 
sumption [1] (cf. Lemma [2]). By using the subgradient boundedness [see (1341) ] and adding 
and subtracting the term 2a k fi{ s {k)) m Eq. (J35l) . we obtain 

mm m 

^Uaty + l)-**!!' < ^||^(A;)-^|| 2 + ^mL 2 -^||^(A;)|| 2 

i=l i=l i=l 

m m 

-2a fc ]T (fi(s(k)) - friz*)) - 2a fc ]T (/,(««(*)) - /«(*(*))) . 

i=l i=l 

Using the subgradient definition and the subgradient boundedness assumption, we fur- 
ther have 

\Mv*{k)) - fi(s(k))\ < L\\v l (k) - s(k)\\ for all i and k. 

Combining these relations with the preceding and using the notation / = Yl^Li fii we 
obtain 

mm m 

i=l i=l i=l 

m 

-2a k (f(s(k)) - f(z*)) + 2a k L ]T - s(A;)||. (36) 



i=l 



Since the weights are all equal, from relation ([21]) we have v l (k) = x(k) for all i and k. 
Using Lemma Mp) with the substitution s = s(k) and x = x{k) = ^ Y^j=\ x K^)i we 
obtain 

1 m m 

\\v\k) - s{k)\\ < — (jjT - (5Zdist(x(jfc),X,-)) for all i and fc. 

Since x J (A;) G Xj, we have dist(x(fc), X,) < \\x(k) — x^(k + 1)|| for all j and /c, Further- 
more, since x G X C Xj for all j, using Assumption [7J we obtain ^ (/c) — x|| < 2B. 
Therefore, for all % and k, 

\\v\k) -s(k)\\ < ^^disk{x{k),Xj) < ^-jr\\x(k) -x j (k + 1)||. (37) 
i=i i=i 

Moreover, we have x(k) = v^(k) for all j and fc, implying 

\\x j (k + l) -x(k)\\ = \\x j (k+ 1) - (^'(Jfe) -a fc dj(fc))|| + a fc ||dj-(A;)||. 

In view of the definition of the error term ^(k) in ( 1271) and the subgradient boundedness, 
it follows 

\\x*(k + l)-x{k)\\ < ||^'(fc)||+a fc L, 
which when substituted in relation (I37|) yields 



\v i {k)-8(k)\\<^-[a k mL + ^2\\<fP{k)\\] for all i and k. (31 
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We now substitute the estimate (138!) in Eq. (136|) and obtain for all k, 

mm m 

+ ~z*\\ 2 < ^||xXA;)-^|| 2 + 4mL 2 -^||^(A;)|| 2 

i=l i=l i=l 

-2a t (/(«(*)) - l(z')) + ^H-ML a 2 

+^^±mm. (39) 



5 

i=l 



Note that for each i, we can write 



< M a ^ BL \ +\w(k)\?- 

Therefore, by summing the preceding relations over i, we have for all k, 
4ai.mBI A„ ,,„.„ 8m 3 B 2 L 2 9 1 A„ „■,,.„, 

— ^ — E 11* (*) ii ^ — ^ — + 2 E (*) ii . 

i=l i=l 

which when substituted in Eq. ( |39|) yields 

Ell^ + i)-**ll 2 < ^\\Ak)-z*f + Cal- l -^\\<j>\k)\\ 2 

-2a* (/(*(*)) - /(**)) , 

where C = ml? + Am2 ^ L2 -\- Sm 3 B 2 L 2 ^ gy re-arranging the terms and summing the 
preceding relations over k for k = K, . . . , N for some arbitrary K and N with K < N, 
we obtain 

N m N 



t=l fc=iC i=l k=K 

m N 

< eii^w-^t + cE^- ( 4 °) 

i=l k=K 

By setting $K = 0$ and letting $N \to \infty$, in view of $\sum_k \alpha_k^2 < \infty$, we see that

$$\frac{1}{2}\sum_{k=0}^{\infty} \sum_{i=1}^m \|\phi^i(k)\|^2 + 2\sum_{k=0}^{\infty} \alpha_k \big(f(s(k)) - f(z^*)\big) < \infty.$$



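Since by Lemma 2(a) we have $s(k) \in X$, the relation $\sum_{i=1}^m \big(f_i(s(k)) - f_i(z^*)\big) \ge 0$ holds
for all $k$, thus implying that

$$\frac{1}{2}\sum_{k=0}^{\infty} \sum_{i=1}^m \|\phi^i(k)\|^2 < \infty, \qquad \sum_{k=0}^{\infty} \alpha_k \big(f(s(k)) - f(z^*)\big) < \infty.$$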

In view of the former of the preceding two relations, we have

$$\lim_{k \to \infty} \phi^i(k) = 0 \quad \text{for all } i,$$

while from the latter, since $\sum_k \alpha_k = \infty$ and $f(s(k)) - f^* \ge 0$ [because $s(k) \in X$ for all
$k$], we obtain

$$\liminf_{k \to \infty} f(s(k)) = f^*. \tag{41}$$
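The passage from the summability of $\alpha_k (f(s(k)) - f^*)$ to relation (41) can be spelled out by a short contradiction argument. The following is a sketch; the symbols $\epsilon$ and $k_0$ are introduced here only for illustration.

```latex
Suppose that $\liminf_{k\to\infty} f(s(k)) > f^*$. Then there exist $\epsilon > 0$ and an index $k_0$
such that $f(s(k)) - f^* \ge \epsilon$ for all $k \ge k_0$. Since every term of the series is
nonnegative, this gives
\[
  \sum_{k=0}^{\infty} \alpha_k \big(f(s(k)) - f^*\big)
  \;\ge\; \epsilon \sum_{k=k_0}^{\infty} \alpha_k \;=\; \infty ,
\]
which contradicts $\sum_{k=0}^{\infty} \alpha_k \big(f(s(k)) - f^*\big) < \infty$.
Hence $\liminf_{k\to\infty} f(s(k)) = f^*$.
```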

Since $\phi^i(k) \to 0$ for all $i$ and $\alpha_k \to 0$ [in view of $\sum_k \alpha_k^2 < \infty$], from Eq. (38) it follows
that

$$\lim_{k \to \infty} \|v^i(k) - s(k)\| = 0 \quad \text{for all } i.$$

Finally, since $x^i(k+1) = v^i(k) - \alpha_k d_i(k) + \phi^i(k)$ [see (27)], in view of $\alpha_k \to 0$, $\|d_i(k)\| \le L$,
and $\phi^i(k) \to 0$, we see that $\lim_{k \to \infty} \|x^i(k+1) - v^i(k)\| = 0$ for all $i$. This and the preceding
relation yield

$$\lim_{k \to \infty} \|x^i(k+1) - s(k)\| = 0 \quad \text{for all } i.$$

We now show that the sequences $\{x^i(k)\}$, $i = 1, \ldots, m$, converge to the same limit
point, which lies in the optimal solution set $X^*$. By taking limsup as $N \to \infty$ in relation
(40) and then liminf as $K \to \infty$ (while dropping the nonnegative terms on the left-hand
side there), since $\sum_k \alpha_k^2 < \infty$, we obtain for any $z^* \in X^*$,

$$\limsup_{N \to \infty} \sum_{i=1}^m \|x^i(N+1) - z^*\|^2 \le \liminf_{K \to \infty} \sum_{i=1}^m \|x^i(K) - z^*\|^2,$$

implying that the scalar sequence $\{\sum_{i=1}^m \|x^i(k) - z^*\|^2\}$ is convergent for every $z^* \in X^*$.
Since $\|x^i(k+1) - s(k)\| \to 0$ for all $i$, it follows that the scalar sequence $\{\|s(k) - z^*\|\}$
is also convergent for every $z^* \in X^*$. In view of $\liminf_{k \to \infty} f(s(k)) = f^*$ [cf. Eq. (41)], it
follows that one of the limit points of $\{s(k)\}$ must belong to $X^*$; denote this limit point by $x^*$.
Since $\{\|s(k) - z^*\|\}$ is convergent for $z^* = x^*$, it follows that $\lim_{k \to \infty} s(k) = x^*$. This
and $\|x^i(k+1) - s(k)\| \to 0$ for all $i$ imply that each of the sequences $\{x^i(k)\}$ converges
to the same vector $x^*$, with $x^* \in X^*$. ∎
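The convergence statement just proved can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes scalar estimates, interval constraint sets, local objectives $f_i(x) = |x - c_i|$ (so subgradients are bounded by $L = 1$), equal weights, and the stepsize $\alpha_k = 1/(k+1)$, which satisfies $\sum_k \alpha_k = \infty$ and $\sum_k \alpha_k^2 < \infty$. The update $x^i(k+1) = P_{X_i}[v^i(k) - \alpha_k d_i(k)]$ is taken to match the decomposition quoted from (27) in the proof; all function names and numerical values are hypothetical.

```python
def project(x, lo, hi):
    """Euclidean projection of a scalar onto the interval [lo, hi]."""
    return min(max(x, lo), hi)


def sign(t):
    """A subgradient of |t| (0 is a valid choice at t = 0)."""
    return (t > 0) - (t < 0)


def projected_subgradient(intervals, centers, x0, num_iters):
    """Uniform-weight projected subgradient iteration for f_i(x) = |x - c_i|.

    intervals: list of (lo, hi) pairs, the assumed constraint sets X_i
    centers:   list of c_i defining the assumed local objectives
    x0:        initial estimates, with x0[i] in X_i
    """
    x = list(x0)
    m = len(x)
    for k in range(num_iters):
        alpha = 1.0 / (k + 1)   # sum alpha_k = inf, sum alpha_k^2 < inf
        v = sum(x) / m          # equal weights: every agent sees the average
        # Each agent takes a subgradient step on its own f_i, then projects on X_i.
        x = [project(v - alpha * sign(v - c), lo, hi)
             for (lo, hi), c in zip(intervals, centers)]
    return x


# Hypothetical instance: X_1 = [-1, 2], X_2 = [0, 3], X_3 = [-2, 1], so X = [0, 1].
intervals = [(-1.0, 2.0), (0.0, 3.0), (-2.0, 1.0)]
# f(x) = |x - 2| + |x - 3| + |x + 5| is decreasing on [0, 1]; its minimizer over X is x = 1.
centers = [2.0, 3.0, -5.0]
x0 = [-1.0, 3.0, -2.0]

estimates = projected_subgradient(intervals, centers, x0, 5000)
print(estimates)
```

In this instance the intersection $X = [0, 1]$ has nonempty interior and the sets are bounded, in line with the interior point and compactness conditions used in the proof. The individual estimates need not lie in $X$ at any finite iteration; they only approach the common optimal point as $\alpha_k \to 0$.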



5 Conclusions 

We studied constrained consensus and optimization problems where agent $i$'s estimate
is constrained to lie in a closed convex set $X_i$. For the constrained consensus problem,
we presented a distributed projected consensus algorithm and studied its convergence
properties. Under some assumptions on the agent weights and the connectivity of the
network, we proved that the agent estimates converge to the same limit, which belongs
to the intersection of the constraint sets $X_i$. We also showed that the convergence rate
is geometric under an interior point assumption for the case when agent weights are
time-invariant and uniform. For the constrained optimization problem, we presented a
distributed projected subgradient algorithm. We showed that, with a stepsize converging
to zero fast enough, the estimates generated by the subgradient algorithm converge to
an optimal solution in two cases: when all agent constraint sets are the same, and when
agent weights are time-invariant and uniform.

The framework and algorithms studied in this paper motivate a number of interesting 
research directions. One interesting future direction is to extend the constrained opti- 
mization problem to include both local and global constraints, i.e., constraints known by 
all the agents. While global constraints can also be addressed using the "primal projec- 
tion" algorithms of this paper, an interesting alternative would be to use "primal-dual" 
subgradient algorithms, in which dual variables (or prices) are used to ensure feasibility 
of agent estimates with respect to global constraints. Such algorithms have been stud- 
ied in recent work [19] for general convex constrained optimization problems (without a 
multi-agent network structure). 

Moreover, in this paper, we presented convergence results for the distributed subgradient
algorithm for two cases: agents have time-varying weights but the same constraint
set; and agents have time-invariant uniform weights and different constraint sets. When
agents have different constraint sets, the convergence analysis relies on an error bound
that relates the distances of the iterates (generated with constant uniform weights) to
each $X_i$ with the distance of the iterates to the intersection set under an interior point
condition (cf. Lemma 2). This error bound is also used in establishing the geometric
convergence rate of the projected consensus algorithm with constant uniform weights. 
These results can be extended using a similar analysis once an error bound is established 
for the general case with time-varying weights. We leave this for future work. 